Cell Physiology

Large-scale phosphotyrosine proteomic profiling of rat renal collecting duct epithelium reveals predominance of proteins involved in cell polarity determination

Boyang Zhao, Mark A. Knepper, Chung-Lin Chou, Trairak Pisitkun


Although extensive phosphoproteomic information is available for renal epithelial cells, previous emphasis has been on phosphorylation of serines and threonines with little focus on tyrosine phosphorylation. Here we have carried out large-scale identification of phosphotyrosine sites in pervanadate-treated native inner medullary collecting ducts of rat, with a view towards identification of physiological processes in epithelial cells that are potentially regulated by tyrosine phosphorylation. The method combined antibody-based affinity purification of tyrosine-phosphorylated peptides with immobilized metal ion chromatography to enrich tyrosine phosphopeptides, which were identified by LC-MS/MS. A total of 418 unique tyrosine phosphorylation sites in 273 proteins were identified. A large fraction of these sites have not been previously reported in standard phosphoproteomic databases. All results are accessible via an online database: http://helixweb.nih.gov/ESBL/Database/iPY/. Analysis of surrounding sequences revealed four overrepresented motifs: [D/E]xxY*, Y*xxP, DY*, and Y*E, where the asterisk symbol indicates the site of phosphorylation. These motifs plus contextual information, integrated using the NetworKIN tool, suggest that the protein tyrosine kinases involved include members of the insulin- and ephrin-receptor kinase families. Analysis of the gene ontology (GO) terms and KEGG pathways whose protein elements are overrepresented in our data set points to structures involved in epithelial cell-cell and cell-matrix interactions (“adherens junction,” “tight junction,” and “focal adhesion”) and to components of the actin cytoskeleton as major sites of tyrosine phosphorylation in these cells. In general, these findings mesh well with evidence that tyrosine phosphorylation plays a key role in epithelial polarity determination.

  • mass spectrometry
  • kidney
  • adherens junction
  • tight junction
  • cytoskeleton

Global phosphoproteomic profiling is providing important new information regarding the signaling pathways involved in functional regulation in renal epithelia (3, 11, 20). The general strategy is to isolate the cells or tissue of interest, extract proteins, proteolyze the proteins with trypsin, and select phosphopeptides using chromatographic techniques, followed by large-scale peptide identification and sequencing using protein mass spectrometry (LC-MS/MS) (25, 41). In general, three amino acid moieties are phosphorylated in eukaryotic tissues, viz. serine, threonine, and tyrosine. The techniques employed in renal epithelial phosphoproteomic analysis, however, have been tilted toward serine and threonine, and not tyrosine phosphorylation. For example, in the current inner medullary collecting duct (IMCD) phosphoproteome database (3) (http://dir.nhlbi.nih.gov/papers/lkem/mpkccdprot/), <2% of the identified phosphorylation sites are on tyrosines. Nevertheless, tyrosine phosphorylation is physiologically important in epithelia in the action of growth factor receptors and in cell adhesion-associated signaling, which are instrumental in the determination of the state of cell differentiation and polarity. Consequently, it was of interest to adapt phosphoproteomics techniques developed for the study of tyrosine phosphorylation in cancer cells (51) to investigation of signaling in the renal IMCD. Such techniques, based on the use of phosphotyrosine-specific antibodies for peptide immunoprecipitation, have allowed the identification and quantification of hundreds of tyrosine phosphopeptides (33, 69). The goal of the present studies was to utilize such approaches to expand the number of tyrosine phosphorylation sites known to be present in native IMCD cells, thereby enhancing the size and value of our IMCD phosphoproteome database.
To do this, we treated IMCD cells with two general protein tyrosine phosphatase (PTP) inhibitors, vanadate and pervanadate, to maximize the number of phosphorylation site identifications. In the “systems biology paradigm,” proteomic databases, such as the one reported here, can provide a basis for hypothesis generation and meta-analysis by researchers in the renal physiology community. The results yielded 503 unique phosphotyrosine-containing peptides, corresponding to 418 tyrosine phosphorylation sites. These sites are shown by computational analyses to be associated with pathways involved in regulation of adherens junctions, tight junctions, and the actin cytoskeleton.


MATERIALS AND METHODS

Animals and IMCD preparation.

Animal procedures were approved by the National Heart, Lung, and Blood Institute Animal Care and Use Committee (no. H-0110). Male Sprague-Dawley rats (200–300 g) were injected intraperitoneally with furosemide (5 mg/rat) 20 min before decapitation and removal of kidneys. Furosemide dissipates the hypertonic medullary osmolality toward plasma osmolality level (∼290 mosmol/kgH2O), thereby preventing osmotic shock to the cells upon isolation of IMCDs (54). IMCDs were isolated from renal medullas as described (8) after digestion (3 mg/ml collagenase B, 2,000 U/ml hyaluronidase, 250 mM sucrose, 10 mM triethanolamine, pH 7.6) at 37°C with continuous stirring for 75–90 min. IMCDs were sedimented by low-speed centrifugation (70 g for 20 s), separating them from the lighter non-IMCD cells. IMCD pellets were washed and sedimented twice in sucrose buffer (250 mM sucrose, 10 mM triethanolamine, pH 7.6), followed by buffer exchange into 290 mosmol/kgH2O bicarbonate buffer (9). In previous studies, an IMCD purity of >80% was achieved by this isolation technique (64).

Pervanadate preparation and treatment.

Preparation of pervanadate has been previously described (30, 38). Briefly, a 30 mM stock solution of pervanadate was prepared using 100 mM sodium orthovanadate (New England BioLabs, Ipswich, MA) and 3% (wt/wt) H2O2 (Fisher Scientific, Hampton, NH) mixed in 1× PBS at 2:1 molar ratio of H2O2:orthovanadate. The mixture was incubated in the dark at room temperature for 15 min. Five minutes prior to treatment, pervanadate was diluted in bicarbonate buffer (118 mM NaCl, 25 mM NaHCO3, 5 mM KCl, 4 mM Na2HPO4, 2 mM CaCl2, 1.2 mM MgSO4, 5.5 mM glucose, 5 mM acetate, gassed with 95% air-5% CO2 for 20 min before use). IMCD suspensions were treated immediately with the diluted pervanadate (final pervanadate concentration: 100 μM) to minimize decomposition of the H2O2-vanadate complex. For the comparison of effects of different treatments, the IMCD suspension was treated with 100 μM pervanadate, 1 mM vanadate, 180 μM H2O2, or 100 μM pervanadate with 100 μg/ml catalase for 10 min. After treatment, the IMCD suspensions were solubilized and denatured with lysis buffer [final concentrations: 8 M urea, 50 mM Tris·HCl, 75 mM NaCl, 1× HALT protease/phosphatase inhibitor cocktail (Thermo Scientific, Rockford, IL), 1 mM sodium orthovanadate]. Samples were sonicated on ice for 30 s. Lysates for immunoblot analysis were resuspended in Laemmli buffer while lysates for proteomic analysis were resuspended in 8 M urea, 75 mM NaCl, and 50 mM Tris·HCl. The protein concentration of the lysate was determined with the BCA assay (Pierce, Rockford, IL).
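The overall dilution implied above (30 mM stock to a final concentration of 100 μM) follows from C1V1 = C2V2. The following is a minimal arithmetic sketch only; the function name and the 10-ml final volume are illustrative assumptions and are not taken from the protocol.

```python
def stock_volume_ul(stock_mM: float, final_uM: float, final_volume_ml: float) -> float:
    """Microliters of stock needed so that final_volume_ml ends up at final_uM."""
    if stock_mM <= 0 or final_uM <= 0 or final_volume_ml <= 0:
        raise ValueError("all quantities must be positive")
    stock_uM = stock_mM * 1000.0                # mM -> uM
    final_volume_ul = final_volume_ml * 1000.0  # ml -> ul
    return final_uM * final_volume_ul / stock_uM  # V1 = C2 * V2 / C1


# Overall dilution factor from the 30 mM stock to the 100 uM working concentration.
dilution_factor = (30 * 1000.0) / 100.0

# Hypothetical example: a 10-ml suspension (this volume is assumed, not from the text).
volume_needed_ul = stock_volume_ul(stock_mM=30, final_uM=100, final_volume_ml=10)
```

Under these assumptions the stock is diluted 300-fold overall, whether the dilution is done in one step or in two.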


Antibodies.

Antiphosphotyrosine monoclonal mouse PY100 (Cell Signaling Technology, Danvers, MA) and PY66 (Sigma-Aldrich, St. Louis, MO) antibodies were used for immunoblotting and immunoprecipitation. The species-specific secondary antibodies conjugated with fluorophores were obtained from Rockland Immunochemicals (Gilbertsville, PA).

Immunoblot analysis.

Immunoblotting of IMCD proteins followed procedures described by Pisitkun et al. (48). Sixteen micrograms of protein in Laemmli buffer were loaded onto a 4–20% gradient SDS-PAGE gel, and electrophoresis was performed at 200 V. Proteins were then transferred onto a nitrocellulose membrane (0.2 μm pore size) under 80 V for 45 min. After incubating in Odyssey Blocking Buffer (LI-COR, Lincoln, NE) for 1 h, primary antibody was added to the membrane and the membrane was incubated overnight. The membrane was washed three times using 1× PBS with 0.1% Tween-20 followed by the application of secondary antibody for 1 h. The membrane was washed three times with 1× PBS with 0.1% Tween-20 followed by a final rinse with 1× PBS. The protein bands on the membrane were scanned using the LI-COR Odyssey Scanner and further analyzed with Odyssey software v2.1.

In-solution trypsin digestion.

Reduction, alkylation, and trypsinization were performed as previously described (25) with modifications. Samples were reduced with 10 mM DTT for 1 h at 55°C and alkylated with 40 mM iodoacetamide for 1 h in the dark at room temperature. Unreacted iodoacetamide was quenched with 40 mM DTT, and the solution was incubated for at least 15 min. Samples were diluted to <1 M urea with 25 mM ammonium bicarbonate and digested overnight at 37°C with trypsin [1:40 (wt/wt)]. Samples were acidified with 0.5% formic acid and spun at 1,000 g for 10 min at 4°C. Finally, the samples were desalted using Oasis 1cc HLB columns (Waters, Milford, MA).

Peptide immunoprecipitation and immobilized metal affinity chromatography.

Phosphotyrosine peptide enrichment was performed with antiphosphotyrosine peptide immunoprecipitation as described by White and colleagues (69) with modifications. A protein G agarose bead slurry (20 μl, IP04, Calbiochem, EMD Chemicals, Darmstadt, Germany) was mixed with 200 μl IP buffer (100 mM Tris, 0.3% Nonidet P-40, pH 7.4) and 12 μg of each antiphosphotyrosine antibody (PY100 and PY66) and incubated for 8 h at 4°C. The beads were washed with IP buffer. Samples, resuspended in 400 μl IP buffer, were added to the beads and incubated overnight at 4°C. The beads were washed three times with rinse buffer (100 mM Tris, pH 7.4), and the phosphotyrosine peptides were eluted from the antibody with 200 μl of elution buffer (100 mM glycine, pH 2.5) for 30 min at room temperature.

Phosphopeptide enrichment using Ga3+ immobilized metal affinity chromatography (IMAC) (Phosphopeptide Isolation Kit, Pierce) was performed as previously described (25) either before or after the immunoprecipitation procedure. For IMAC followed by immunoprecipitation, an extra step of desalting with a Graphite Spin Column (Pierce, Thermo Scientific) to remove excess ammonium bicarbonate between the two enrichment procedures was performed. Enriched peptides were desalted with PepClean C-18 Spin Columns (Pierce, Thermo Scientific) and resuspended in 0.1% formic acid.

LC-MS/MS analysis.

All samples were analyzed on an LTQ-Orbitrap XL mass spectrometer (Thermo Scientific) interfaced with a NanoLC-1D Plus system (Eksigent Technologies). Fragmentation of peptide ions was achieved via collision-induced dissociation. Samples were loaded onto an Agilent Zorbax 300SB-C18 trap column [0.3 mm inner diameter (ID) × 5 mm length, 5-μm particle size] at a flow rate of 10 μl/min for 15 min. Reversed-phase C18 chromatographic separation of trapped peptides was carried out on a prepacked BetaBasic C18 PicoFrit column (75 μm ID × 10 cm length; New Objective) at 300 nl/min using the following gradient: 2–5% solvent B (balance, solvent A) for 2 min; 5–45% solvent B for 45 min; 45–50% solvent B for 5 min; 50–95% solvent B for 5 min (solvent A: 0.1% formic acid in 98% water, 2% acetonitrile; solvent B: 0.1% formic acid in 100% acetonitrile). A data-dependent mode was employed such that a single survey MS1 scan for precursor ions was followed by six data-dependent MS2 scans. Survey MS scans were acquired in the Orbitrap component with a resolution of 30,000, and MS2 scanning was performed in the linear ion trap.
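The gradient program described above can be written down as a small table of segments. The sketch below encodes it and interpolates the solvent B percentage at a given time; it assumes that each segment is a linear ramp and that the segments run back to back, which the text implies but does not state explicitly. The names `GRADIENT` and `percent_b` are illustrative.

```python
# (duration in min, starting % solvent B, ending % solvent B) for each segment
GRADIENT = [(2, 2, 5), (45, 5, 45), (5, 45, 50), (5, 50, 95)]


def percent_b(t_min: float) -> float:
    """Solvent B percentage at time t_min, assuming linear ramps within segments."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for duration, start, end in GRADIENT:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    # After the program ends, hold the final composition (an assumption).
    return float(GRADIENT[-1][2])


total_minutes = sum(duration for duration, _, _ in GRADIENT)
```

With these segments the program lasts 57 min in total, and the main 5–45% ramp passes through 25% solvent B at 24.5 min.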

Data searching and scoring.

Searches were performed using the latest version of the rat RefSeq database [National Center for Biotechnology Information (NCBI)] with concatenated forward and reversed sequences to allow for target-decoy analysis. The database also contained sequences for common MS contaminants, such as human keratin and porcine trypsin. Spectra were searched with SEQUEST (18), InsPecT (63), and OMSSA (22). MS searches were aided by the high-performance computational capabilities of the Biowulf Linux cluster at the National Institutes of Health (http://biowulf.nih.gov). Spectra were filtered to obtain a false discovery rate (FDR) of 1% as described previously (26). Phosphorylation site localization was performed using PhosphoScore (52) for the SEQUEST data and Phosphate Localization Score for the InsPecT data (2). Venn diagrams for visualizing the results of the three search algorithms were generated using Venn Diagram Plotter v1.4.3740. The online database PhosphoSitePlus (28) (http://www.phosphosite.org/) was used to determine whether phosphorylation sites identified in this study had been previously archived.
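For readers unfamiliar with target-decoy filtering, the idea can be sketched as follows. This is a simplified illustration, not the procedure of Ref. 26: it assumes that a higher score means a better match and estimates the FDR above a cutoff as decoy hits divided by target hits, which is one common convention. The function name and input layout are hypothetical.

```python
from typing import List, Optional, Tuple


def score_cutoff_for_fdr(
    psms: List[Tuple[float, bool]], max_fdr: float = 0.01
) -> Optional[float]:
    """Lowest score cutoff whose estimated FDR (decoys / targets) is <= max_fdr.

    psms: (score, is_decoy) pairs, where a higher score means a better match.
    Returns None when no cutoff satisfies the requested FDR.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for i, (score, is_decoy) in enumerate(ranked):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Only evaluate once every match sharing this score has been counted.
        end_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if end_of_tie and targets > 0 and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff
```

Matches to reversed sequences serve as the decoys; all spectra scoring at or above the returned cutoff would be retained.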

Phosphorylation site functional analyses.

The conservation of the phosphorylated amino acids among multiple mammalian species was determined using the NCBI HomoloGene database v64 for obtaining the orthologous proteins and ClustalW2 v2.0.12 for aligning the sequences. Phosphorylation motifs were extracted using Motif-X v1.2 (56) (http://motif-x.med.harvard.edu/motif-x.html). Only peptides containing phosphorylated tyrosine were analyzed, and the sequences were prealigned with in-house software before submission to Motif-X. Sequence logos were generated with Weblogo (10) (http://weblogo.berkeley.edu/). Kinase predictions were performed using NetworKIN v2.0 (31, 32) (http://networkin.info/) with phosphoprotein sequence and corresponding tyrosine phosphorylation sites as inputs. The human database was used because this was the database available with the greatest homology to Rattus norvegicus. Prediction is applicable to non-human species because the NetworKIN algorithm considers both consensus motifs and context data that are expected to be conserved through evolution (62). A cutoff NetworKIN score of 0.7 was used. The predicted kinases were cross-referenced with the IMCD transcriptome (http://dir.nhlbi.nih.gov/papers/lkem/imcdtr/).
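The motif notation used in this paper (for example, [D/E]xxY*, where Y* is the phosphorylated tyrosine and x is any residue) can be made concrete with a short sketch that tests a single site against the four reported motifs. This only checks a site against known motifs; it does not reproduce the Motif-X discovery algorithm. The function name and the example sequence are hypothetical.

```python
# Motif name -> (offset relative to the phosphotyrosine, residues allowed there)
MOTIFS = {
    "[D/E]xxY*": (-3, "DE"),  # acidic residue three positions before Y*
    "Y*xxP": (3, "P"),        # proline three positions after Y*
    "DY*": (-1, "D"),         # aspartate immediately before Y*
    "Y*E": (1, "E"),          # glutamate immediately after Y*
}


def matching_motifs(sequence: str, site_index: int) -> list:
    """Names of the motifs matched by the tyrosine at site_index (0-based)."""
    if not 0 <= site_index < len(sequence) or sequence[site_index] != "Y":
        raise ValueError("site_index must point at a tyrosine (Y)")
    hits = []
    for name, (offset, allowed) in MOTIFS.items():
        pos = site_index + offset
        if 0 <= pos < len(sequence) and sequence[pos] in allowed:
            hits.append(name)
    return hits


# Hypothetical sequence, made up for illustration only.
example_hits = matching_motifs("AEAAYEAPAA", 4)
```

A single site can match more than one motif, as the made-up example does.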

Gene ontology and network analyses.

Gene ontology (GO) enrichment was assessed and visualized with the Biological Network Gene Ontology (BiNGO) plugin v2.44 (34) for Cytoscape v2.8.1 (58). The level of enrichment was assessed by hypergeometric distribution tests. The P value was corrected for multiple hypothesis testing with the Benjamini-Hochberg FDR algorithm (4), implemented in BiNGO. Enriched “Biological Process,” “Molecular Function,” and “Cellular Component” terms with a significance level of < 0.01 after correction were used in the visualization. Functional clustering was performed with Functional Annotation Clustering in DAVID v6.7 (12, 29) (http://david.abcc.ncifcrf.gov/) using the following databases: GO FAT, KEGG pathway, SMART, InterPro, Swiss-Prot/Protein Information Resource Keywords, and UniProt Sequence Features. Enrichment score was reported as the minus log transformation of the geometric mean of P values (modified Fisher's exact test). The list of proteins for enriched GO terms and list of enriched KEGG pathways were also generated with DAVID. The level of enrichment was assessed by modified Fisher's exact test (EASE). The P value was corrected for multiple hypothesis testing with the Benjamini-Hochberg FDR algorithm. Terms with a corrected P < 0.01 were considered significant. For the control, the proteins corresponding to the nonphosphorylated peptides identified in our mass spectrometry analysis were used. For both BiNGO and DAVID analyses, the reference set used was the IMCD transcriptome (http://dir.nhlbi.nih.gov/papers/lkem/imcdtr/). The phosphotyrosine data set was also imported into Ingenuity Pathway Analysis (IPA) v9.0 (Ingenuity Systems, Redwood City, CA) to identify top canonical pathways containing enriched components.
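The two statistical steps named above, a hypergeometric test followed by Benjamini-Hochberg correction, can be sketched with the standard library alone. This is an illustration of the general technique rather than the BiNGO or DAVID implementation; in particular, DAVID's EASE score is a modified (more conservative) Fisher's exact test and is not reproduced here. Function and argument names are assumptions.

```python
from math import comb


def hypergeom_upper_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) when n proteins are drawn from a reference set of N,
    of which K carry the annotation of interest."""
    total = comb(N, n)
    upper = min(n, K)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible combinations contribute nothing to the sum.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def benjamini_hochberg(pvalues: list) -> list:
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A term would then be called enriched when its adjusted P value falls below the 0.01 threshold stated above.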


RESULTS

Tyrosine phosphatase inhibition.

The objective of this paper was to identify protein substrates of tyrosine kinases in IMCD cells and to localize the sites of phosphorylation. To maximize the identifications, we used tyrosine phosphatase inhibitors. Vanadate and pervanadate are two widely used general PTP inhibitors. In previous studies, pervanadate, generated as a complex of vanadate and H2O2, was found to be an irreversible inhibitor of PTP and to be a much more potent inhibitor than vanadate in raising the cellular tyrosine phosphorylation level (30, 49, 57). Our findings in IMCD cells agree with these observations (Fig. 1). Relative to the no-treatment control, protein tyrosine phosphorylation was not augmented by treatment with H2O2 or vanadate alone but was markedly increased by pervanadate treatment, in either the absence or the presence of catalase. Thus, the IMCD suspensions were treated with pervanadate for the rest of our study.

Fig. 1.

Immunoblot analysis illustrating effective tyrosine phosphorylation enrichment with pervanadate treatment on rat inner medullary collecting duct (IMCD) samples. IMCD suspension was incubated with 100 μM pervanadate, 1 mM vanadate, 180 μM H2O2, or 100 μM pervanadate + 100 μg/ml catalase for 10 min. A no-treatment control was also used. Equal amounts of lysate (16 μg) were loaded into each well for immunoblot analysis using two antiphosphotyrosine antibodies, PY100 (A) and PY66 (B).

Phosphotyrosine enrichment.

Phosphotyrosine-containing peptides were enriched using antiphosphotyrosine immunoprecipitation. This technique has been successfully used for studying tyrosine phosphorylation dynamics in epidermal growth factor receptor signaling networks (69). The combination of immunoprecipitation and IMAC on peptides from pervanadate-treated IMCD proteins was found to be effective in enriching phosphotyrosine with high specificity. In addition, the ordering of immunoprecipitation and IMAC was assessed. Although IMAC followed by immunoprecipitation yielded higher numbers of peptides, the results from both approaches were combined to maximize the yield. Altogether, these results demonstrate that the enrichment techniques described succeeded at enriching tyrosine phosphorylation in IMCD.

Phosphoproteomic profiling of inner medullary collecting duct.

Inner medullary collecting ducts were subjected to phosphoproteomic analysis as described in methods. Peptide samples were analyzed on an LTQ Orbitrap XL mass spectrometer, and the resulting spectra were searched using three different search algorithms—SEQUEST, InsPecT, and OMSSA—to maximize the number of phosphopeptide identifications from high-quality spectra (3, 20, 23, 50). The data set was filtered using a target-decoy approach (16, 17) to limit false-positive identifications to <1%. A total of 2,034 peptides were identified, of which 1,044 were nonphosphopeptides and 990 were phosphopeptides (Fig. 2). A total of 912 phosphotyrosine-containing peptides, corresponding to 503 unique peptides and 273 proteins, were identified by the combination of the three search algorithms. These peptides contain 453 unique phosphorylation sites, of which 418 were unique tyrosine phosphorylation sites (see Supplemental Table S1; Supplemental Material for this article is available online at the Journal website). Approximately 19% of the tyrosine phosphorylation sites identified were not found in the PhosphoSitePlus database and are thus considered novel. Annotated phosphopeptide data from all searches are accessible online via http://helixweb.nih.gov/ESBL/Database/iPY/. A representative mass spectrum (showing Y1027 phosphorylation in the tight junction protein ZO-1) is shown in Fig. 3.
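Combining identifications from three search algorithms amounts to taking the union of three sets, and the Venn diagram reports the size of each overlap region. A minimal sketch of that bookkeeping follows; the peptide identifiers in the usage example are placeholders, and the function name is an assumption rather than part of any tool used in this study.

```python
def venn_regions(sequest_ids, inspect_ids, omssa_ids):
    """Count identifications in each region of a three-set Venn diagram.

    Returns (regions, union_size), where regions maps a tuple of algorithm
    names to the number of identifications found by exactly those algorithms.
    """
    sets = {
        "SEQUEST": set(sequest_ids),
        "InsPecT": set(inspect_ids),
        "OMSSA": set(omssa_ids),
    }
    union = set().union(*sets.values())
    regions = {}
    for item in union:
        key = tuple(name for name, members in sets.items() if item in members)
        regions[key] = regions.get(key, 0) + 1
    return regions, len(union)


# Placeholder identifiers, for illustration only.
regions, n_unique = venn_regions({"A", "B", "C"}, {"B", "C", "D"}, {"C", "E"})
```

In the placeholder example, five unique identifications are obtained in total, only one of which is reported by all three algorithms.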

Fig. 2.

Number of nonphosphopeptides, phosphoserine/threonine peptides, and phosphotyrosine peptides identified in rat IMCD samples.

Fig. 3.

Typical mass spectrum showing identification of phosphorylation of Y1027 in the tight junction protein ZO-1. m/z, mass-to-charge ratio.

Computational analyses of phosphotyrosine peptides.

We analyzed the level of conservation for each tyrosine phosphorylation site among mammalian species other than rat (Supplemental Table S2). A total of 256 phosphorylation sites for which the corresponding proteins have orthologs (based on HomoloGene database) were included and analyzed. Approximately 84% of sites analyzed were 100% conserved among species examined, while 12% were moderately conserved (conserved in 50–99% of species) and 5% were poorly conserved (conserved in <50% of species) (Fig. 4A). The degree of conservation for the tyrosine phosphorylation site was also higher than that of the neighboring amino acids, for which the percentage of sites that are 100% conserved ranged from 70% to 80% (Fig. 4B). Overrepresented motifs were also identified in sequences flanking the phosphorylation sites using Motif-X (Fig. 4C). Four motifs were identified: [D/E]xxY*, Y*xxP, DY*, and Y*E, where the asterisk symbol indicates the site of phosphorylation. This suggests that multiple tyrosine kinases are responsible for the phosphorylation at the identified sites.
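The three conservation classes used here can be expressed as a simple rule on the fraction of orthologous species that retain a tyrosine at the aligned position. The sketch below assumes the aligned residue for each non-rat species has already been extracted from the alignment; the function name and class labels are illustrative.

```python
def conservation_class(aligned_residues, reference: str = "Y") -> str:
    """Classify a site from the residue found at the aligned position
    in each orthologous (non-rat) species."""
    if not aligned_residues:
        raise ValueError("at least one orthologous species is required")
    conserved = sum(1 for residue in aligned_residues if residue == reference)
    fraction = conserved / len(aligned_residues)
    if fraction == 1.0:
        return "highly conserved"      # 100% of species
    if fraction >= 0.5:
        return "moderately conserved"  # 50-99% of species
    return "poorly conserved"          # <50% of species
```

For example, a site that is a tyrosine in three of four orthologs would be classed as moderately conserved.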

Fig. 4.

Conservation of tyrosine phosphorylation sites detected by MS in rat IMCD samples. A: number of sites that are highly conserved, moderately conserved, or poorly conserved among mammalian species. B: percent perfect identity (100% conservation) of neighboring amino acids of phosphorylated tyrosine residue (position 0). C: sequence logos showing overrepresented phosphorylation motifs extracted using Motif-X. D: a list of potential protein kinase families that phosphorylate these tyrosine sites as analyzed by NetworKIN.

We next used NetworKIN (32) to identify potential protein kinases that phosphorylate these tyrosine sites. The algorithm combines kinase consensus motifs, extracted from the NetPhorest atlas (39), with contextual information of the kinase and substrate in protein association networks, extracted from the STRING database (61). After cross-referencing the results with the IMCD transcriptome, this analysis reveals a large number of kinases belonging to the insulin and ephrin receptor kinase families (Fig. 4D). Other kinase groups included MAP2K, Src, Tec, Abl, EGFR, and Met. A list of all predicted kinases with the corresponding phosphoprotein substrates and phosphorylation sites is provided in Supplemental Table S3.
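The post-processing of the kinase predictions (score cutoff of 0.7, then cross-referencing against the IMCD transcriptome) is a two-condition filter. The sketch below illustrates it under stated assumptions: the record layout is hypothetical and is not NetworKIN's output format, the cutoff is treated as inclusive, and all protein names and scores are placeholders rather than results from this study.

```python
def filter_kinase_predictions(predictions, expressed_kinases, min_score=0.7):
    """Keep predictions that meet the score cutoff and whose kinase
    is present in the expressed (transcriptome) set."""
    return [
        p for p in predictions
        if p["score"] >= min_score and p["kinase"] in expressed_kinases
    ]


# Placeholder records, made up for illustration only.
predictions = [
    {"substrate": "PROT_A", "site": "Y100", "kinase": "KIN_1", "score": 0.90},
    {"substrate": "PROT_A", "site": "Y200", "kinase": "KIN_2", "score": 0.50},
    {"substrate": "PROT_B", "site": "Y300", "kinase": "KIN_3", "score": 0.95},
]
expressed = {"KIN_1", "KIN_2"}  # KIN_3 is not detected in the transcriptome
kept = filter_kinase_predictions(predictions, expressed)
```

Here only the first record survives: the second falls below the cutoff, and the third is predicted for a kinase that is not in the expressed set.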

Functional classification enrichment.

To assess the functional roles of tyrosine phosphorylated IMCD proteins, enrichment of GO terms relative to the IMCD transcriptome (64) (http://dir.nhlbi.nih.gov/papers/lkem/imcdtr/) was analyzed. Since GO terms are categorized into different hierarchies with interdependent relationships, visualization of GO as a network helps in understanding the relationship of the lower level terms with the broader terms (Figs. 5 and 6). The network also enables the identification of functional modules/clusters. The nodes of a network represent GO terms while the directed edges describe the relationship between terms. Distinct clusters of Biological Process, Molecular Function, and Cellular Component GO terms were immediately apparent. For Cellular Component terms, the clusters relating to “cell junction” and “cytoskeleton” were enriched. Significantly enriched Molecular Function GO terms contained two clusters of terms involved with “protein tyrosine kinase activity” and “protein binding” (Fig. 5). For Biological Process, the highly enriched GO terms fall under groups that describe “cytoskeleton organization,” “cell adhesion,” “protein amino acid phosphorylation,” “transmembrane receptor protein tyrosine signaling pathway,” and “regulation of localization” (Fig. 6).

Fig. 5.

Gene Ontology (GO) Cellular Component and Molecular Function terms enrichment of IMCD phosphotyrosine proteins relative to IMCD transcriptome using BiNGO. Enrichment was assessed with hypergeometric distribution tests, corrected for multiple hypothesis testing with Benjamini-Hochberg false discovery rate (FDR) algorithm. Only GO terms with P < 0.01 are displayed. Nodes represent GO terms while directed edges represent the relationship between terms. Node size is proportional to the number of IMCD proteins categorized to the GO term. Node color corresponds to P value, with the highly enriched terms shown in orange.

Fig. 6.

GO Biological Process terms enrichment of IMCD phosphotyrosine proteins relative to IMCD transcriptome using BiNGO. Enrichment was assessed with hypergeometric distribution tests, corrected for multiple hypothesis testing with Benjamini-Hochberg FDR. Only GO terms with P < 0.01 are displayed. Nodes represent GO terms while directed edges represent the relationship between terms. Node size is proportional to the number of IMCD proteins categorized to the GO term. Node color corresponds to P value, with the highly enriched terms shown in orange.

We complemented our GO singular enrichment analysis above with a modular enrichment analysis approach using DAVID functional annotation clustering. The GO singular enrichment approach extracts biological meaning by determining whether each individual GO term is overrepresented in the data set. This approach may miss functional modules that are only apparent when classes of genes or terms are considered. The DAVID functional annotation clustering utilizes a set of fuzzy clustering techniques to classify the input data into functionally related groups with integration across multiple databases (see methods for details). The top seven functional clusters were “SH3 domain,” “cell junction,” “cell adhesion,” “cytoskeleton organization,” “actin binding,” “tight junction,” and “cell surface receptor linked signal transduction” (Fig. 7A). To demonstrate that the enrichment is specific to the phosphotyrosine data set, we performed the same analysis on proteins containing nonphosphorylated peptides identified in our IMCD sample. The highly enriched terms in this control data set (“glycolysis,” “vesicle,” “cytoskeleton organization,” “cofactor/coenzyme metabolic process,” etc.) (Fig. 7B) were largely different from those for the phosphotyrosine data set. The term “cytoskeleton organization” is common to both data sets because of the existence of shared proteins (23% of the proteins in the nonphosphorylated control for this term were also found in the phosphotyrosine data set). All functional clusters with enrichment score > 1.3 (corresponding to minus log of P = 0.05) are included in Supplemental Table S4.
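The enrichment score and its 1.3 threshold can be reproduced with a few lines of arithmetic. The sketch below assumes a base-10 logarithm, which is inferred from the stated correspondence between a score of 1.3 and P = 0.05; the function name is illustrative and this is not DAVID's own code.

```python
import math


def enrichment_score(pvalues) -> float:
    """Minus log10 of the geometric mean of the P values in a cluster."""
    if not pvalues or any(p <= 0 for p in pvalues):
        raise ValueError("P values must be a non-empty list of positive numbers")
    # -log10(geometric mean) equals the mean of the individual -log10(p) values.
    return -sum(math.log10(p) for p in pvalues) / len(pvalues)


# A cluster whose geometric-mean P value is exactly 0.05 sits at the threshold.
threshold_score = enrichment_score([0.05])
```

Because the score is an average of the per-term values of minus log P, a single very significant term can lift a cluster above the threshold even when its other terms are weak.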

Fig. 7.

Top seven functional annotation clusters of rat IMCD phosphotyrosine proteins (A) and nonphosphorylated proteins (B) relative to IMCD transcriptome using DAVID analysis. See methods for details on the approach and the list of databases used. Enrichment score is reported as the minus log transformation of the geometric mean of P values (modified Fisher's exact test).

On the basis of these two analyses, detailed descriptions of proteins in selected GO categories were generated using DAVID. Tables 1–4 include proteins related to “cell junction,” “cytoskeleton organization,” “cell surface receptor linked signal transduction,” and “protein tyrosine kinase activity,” respectively. Table 5 contains proteins with either the SH2 or SH3 domain, or both.

Signaling network analyses.

We next wanted to see whether specific pathways are enriched in the set of proteins present in our phosphotyrosine database. DAVID analysis with P value determined using a modified Fisher's exact test with Benjamini-Hochberg FDR correction and a cutoff of P < 0.01 was used. The top KEGG pathways included “adherens junction,” “tight junction,” “focal adhesion,” and “regulation of actin cytoskeleton” (Fig. 8A). As expected, our control data set with only proteins containing the nonphosphorylated peptides did not show any enriched pathways associated with cell junction. Instead, the top pathways were found to be involved in glycolysis and pyruvate metabolism (Fig. 8B).

Fig. 8.

Enriched KEGG pathways based on rat IMCD phosphotyrosine proteins (A) and nonphosphorylated proteins (B) relative to rat IMCD transcriptome using DAVID analysis. P values were determined with modified Fisher's exact test and corrected for multiple hypothesis testing with Benjamini-Hochberg FDR. Pathways with corrected P < 0.01 are presented.

To examine the signaling and protein-protein interactions involved in the enriched canonical pathways, the same phosphotyrosine data set was imported into Ingenuity Pathway Analysis. “Tight junction” and “actin cytoskeleton signaling” pathways were also found to be highly enriched in this analysis. Proteins with tyrosine phosphorylation found in our study are highlighted in these two pathways in Figs. 9 and 10. Additional enriched signaling pathways, viz. “ephrin receptor,” “insulin receptor,” and “integrin signaling,” are included in Fig. 11.

Fig. 9.

Diagram of canonical tight junction signaling pathway. Enriched phosphotyrosine proteins are shown in gray. Analysis and visualization using IPA (see methods).

Fig. 10.

Diagram of canonical actin cytoskeleton signaling pathway. Enriched phosphotyrosine proteins are shown in gray. Analysis and visualization using IPA (see methods).

Fig. 11.

Diagrams of canonical ephrin (A), insulin receptor (B), and integrin (C) signaling pathways. Enriched phosphotyrosine proteins are shown in gray. Analysis and visualization using IPA (see methods).


DISCUSSION

Systems biology approach to the study of cell physiology.

An understanding of the complex regulatory machinery of a cell is increasingly reliant on the use of a systematic and integrative approach. Cell physiology in particular involves a complicated network of interactions of proteins and other molecules. As such, a comprehensive understanding involves first, the identification and quantification of all components of this network; and second, the analysis of the functional roles and interactions of these components to elucidate how the cell functions as a whole.

The enumeration of components has benefited greatly from recent advances in high-throughput techniques such as DNA and oligonucleotide microarrays, deep sequencing, protein mass spectrometry, multiplex affinity-based assays, and cell sorting-based assays. We have previously used mass spectrometry to perform phosphoproteomic profiling in several cell types in the kidney (3, 20, 23, 50). Although tyrosine phosphorylation plays a critical role in cellular signaling and is dysregulated in a number of diseases, its extremely low abundance [estimated to constitute only 1.8% of the phosphoproteome (43)] makes its detection in phosphoproteomic analyses challenging. In fact, before this study, our IMCD phosphoproteome database contained only 55 unique phosphotyrosine sites (3). Therefore, in this study, we applied phosphotyrosine enrichment techniques in combination with protein mass spectrometry for large-scale profiling of tyrosine phosphorylation sites of proteins from the native epithelium of the renal IMCD. Overall, we identified 418 unique tyrosine phosphorylation sites. Over 19% of these sites were not found in the PhosphoSitePlus database and are therefore labeled “novel.” The entire database can be found in Supplemental Table S1 and is also available online at http://helixweb.nih.gov/ESBL/Database/iPY/.

Tyrosine signaling networks in IMCD epithelia.

An understanding of the functional role of phosphotyrosine-mediated signaling in IMCD epithelial cells can potentially be achieved by integrating our database with functional annotation data sets. The renal IMCD is the terminal portion of the renal tubule and the final site for adjusting urinary composition and volume. Therefore, the IMCD is critical for solute transport and for maintenance of homeostasis. Especially important are anchoring junctions such as adherens junctions and focal adhesions that provide cell-cell and cell-extracellular matrix (ECM) interaction and attachment, as well as tight junctions that (together with anchoring junctions) establish and maintain cell polarity (7). Interestingly, the proteins found in our phosphotyrosine database are highly enriched in GO Cellular Component annotations “plasma membrane,” “cell junction,” and “cytoskeleton” (Figs. 5 and 7 and Table 1). Functional classification analyses also suggest tyrosine phosphorylation to be involved in “cytoskeletal protein binding,” “cytoskeleton organization,” “cell adhesion,” and “cell surface receptor linked signal transduction” (Figs. 5–7 and Tables 2 and 3). This is in agreement with the mounting evidence demonstrating the coordination of junctional proteins with signaling molecules (particularly receptor tyrosine kinases) to regulate cellular processes (37, 44). A large number of non-receptor tyrosine kinases, receptor tyrosine kinases, non-receptor tyrosine phosphatases, and receptor tyrosine phosphatases have also been identified at epithelial adherens junctions (37, 40). Not surprisingly, phosphotyrosine staining in epithelial cells is most prominent at junctional regions (35, 60).

Table 1. Gene Ontology Cellular Component: “cell junction”

Table 2. Gene Ontology Biological Process: “cytoskeleton organization”

Table 3. Gene Ontology Biological Process: “cell surface receptor linked signal transduction”

These enriched functions agree with the model in which cell-cell adhesion (mediated principally by E-cadherin) and cell-ECM adhesion (mediated by integrins) initiate spatial cues for the establishment of physical and molecular asymmetry in epithelial cells, a process that is largely dependent on cytoskeletal reorganization (68). This asymmetry/polarization occurs both along the apico-basal axis, which defines the apical and basolateral membrane domains, and along a second axis, commonly referred to as the planar cell polarity axis, that is perpendicular to the apico-basal axis.

Involvement of tight junction and adherens junction in cell polarity.

An important step in establishing apico-basal polarity is defining the apico-basal axis and forming the tight junction (65). Therefore, we expected signaling networks involved in tight junction formation to be overrepresented. This was indeed observed in our phosphotyrosine data set (Figs. 7 and 8). A critical component of this network is a group of widely studied transmembrane proteins, viz. occludin, claudin, and junctional adhesion molecule (JAM). Phosphorylation of these proteins is a major mode of regulation that affects their binding affinity for adaptors such as the zonula occludens proteins ZO-1, ZO-2, and ZO-3 and, subsequently, tight junction assembly (19, 21, 46). Although phosphorylation of occludin on Y342 has been previously reported according to the PhosphoSitePlus database, this site has not been well characterized. Our analyses indicate that this site is highly conserved and is predicted to be modulated by a number of different kinases, including ephrin and insulin receptors and the MAP2K family of kinases (Supplemental Table S3). Like sites Y398 and Y402 of occludin, which are known to destabilize binding to ZO-1 (15), Y342 may serve a regulatory role in the interaction of occludin with tight junction proteins. We have also identified a number of phosphotyrosine sites on ZO-1 (Y576, Y1027, Y1033, Y1054, Y1127, Y1152, Y1165, Y1178, Y1311, Y1333, Y1341) and ZO-2 (Y234, Y238, Y486, Y891, Y1093). Interestingly, sites Y1027 and Y1033 of ZO-1 have not been previously reported and are highly conserved. Site Y1027 is predicted to be modulated by the insulin and/or insulin-like growth factor 1 receptor, while Y1033 can potentially be modulated by ephrin receptor A7 and/or A3.

We have also identified a number of other phosphotyrosine proteins involved in the tight junction signaling network, including afadin (Y203, Y993, Y1237, Y1292, Y1502), vinculin (Y822), cingulin (Y238), MAGI-1 (Y373), PALS1 (also known as MPP5) (Y243, Y249), Par-3 (Y719), and Par-6 beta (Y101) (Fig. 9). Together with aPKC, Par-3 and Par-6 constitute the Par complex. PALS1, Crumbs, and PATJ constitute the Crumbs complex. These two complexes, together with the Scribble complex, are the three highly conserved core polarity complexes known to play a critical role in mediating apico-basal polarity (47). In addition, the proteins of the Crumbs and Par complexes are colocalized in the apical domain at epithelial tight junctions and are thought to be involved in the assembly of tight junctions (46). Members of the MAGI family of proteins have also been found at the tight junction (46). In particular, MAGI-3 has recently been shown to colocalize with ZO-1 and cingulin in epithelial cells and to serve as a scaffolding protein by recruiting the phosphatases PTEN and RPTPβ to their substrates (1, 67). The novel site Y356 of MAGI-3 found in our study may serve a regulatory role in activating/deactivating the protein's recruitment activities. Moreover, this site is within the [D/E]xxY motif, which is among several motifs (e.g., YxxP and Y[A/G/S/T/E/D]) recognized by the kinase c-Src (56). However, NetworKIN analysis, which combines motif information with contextual information, predicts the kinase to be the insulin and/or insulin-like growth factor 1 receptor. This may suggest the importance of subcellular localization, in this case to the tight junction, in controlling the specificity of signaling and information flow.
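Whether a given phosphotyrosine site falls within one of the four overrepresented motifs reported in this study ([D/E]xxY*, Y*xxP, DY*, and Y*E) can be checked mechanically from the residues that flank the site. The following is a minimal sketch of such a check and is not the motif-discovery procedure used in the study. The function name `match_motifs` and the example sequence windows are hypothetical and are not taken from the database.

```python
# Each motif is written as {offset from the phosphotyrosine: allowed residues}.
# Offset 0 is the phosphorylated tyrosine itself; "x" positions are unconstrained.
MOTIFS = {
    "[D/E]xxY*": {-3: "DE"},
    "Y*xxP": {3: "P"},
    "DY*": {-1: "D"},
    "Y*E": {1: "E"},
}


def match_motifs(window, center):
    """Return the names of the motifs matched by the site at window[center].

    window: amino acid sequence (one-letter codes) surrounding the site
    center: index of the phosphorylated tyrosine within `window`
    """
    if window[center] != "Y":
        raise ValueError("the residue at `center` must be a tyrosine")
    matched = []
    for name, constraints in MOTIFS.items():
        satisfied = True
        for offset, allowed in constraints.items():
            pos = center + offset
            # A constraint that falls outside the window cannot be satisfied.
            if pos < 0 or pos >= len(window) or window[pos] not in allowed:
                satisfied = False
                break
        if satisfied:
            matched.append(name)
    return matched


# Hypothetical 13-residue windows with the tyrosine at index 6.
print(match_motifs("AAADAAYEAPAAA", 6))
print(match_motifs("GGGGGDYGGGGGG", 6))
```

A single site can match more than one motif, which is one reason motif membership alone does not identify the responsible kinase and why tools such as NetworKIN add contextual information.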

Several components present in the tight junction signaling network are actually part of the adherens junction. Thus, enrichment in both “adherens junction” and “tight junction” was expected in the KEGG pathway enrichment analysis (Fig. 8). In particular, α-catenin is an adherens junction protein that, aside from its role in mediating the association of E-cadherin/β-catenin with actin, also serves as an adaptor in mediating the interaction of ZO-1 and ZO-2 with actin (46). We have identified Y177 of α-catenin as phosphorylated, which agrees with the phosphorylation previously reported for other species according to PhosphoSitePlus. The functional significance of this site remains to be determined, although a recent study suggests that tyrosine phosphorylation of α-catenin is involved in its translocation to the plasma membrane and subsequent association with β-catenin (5). Hence, one speculation is that α-catenin Y177 is involved in the positive regulation of adherens and tight junction assembly and the establishment of apico-basal polarity. Another binding partner of α-catenin is the F-actin-binding protein afadin, which is known to associate with a number of other cell adhesion molecules, including nectins and Lim domain only 7 (LMO7) (14). We found two phosphorylation sites on LMO7, Y1358 and Y1588, that have not been archived in PhosphoSitePlus. Of these two sites, Y1588 is highly conserved and is a potential target for phosphorylation by Tec and/or MAP2K3/MAP2K4. LMO7 associates not only with afadin in the nectin/afadin complex, but also with α-actinin, which is one of several actin-associated proteins that interact with α-catenin in the E-cadherin/catenin complex (14). Therefore, the novel sites identified on LMO7 may serve regulatory roles in modulating the interaction and signaling between the nectin/afadin and E-cadherin/catenin adhesion complexes at adherens junctions.

Remarkably, all core planar cell polarity proteins have been found to colocalize with junctional apical complexes (66). Recent studies also support the existence of interactions between components involved in apico-basal polarity and planar cell polarity (13). This suggests potential cross talk among the signaling networks for these two types of cell polarity, and the discussion above on apico-basal polarity may also be relevant for the regulation of planar cell polarity.

Regulation of cytoskeletal organization.

Much of the regulatory mechanism in initiating and establishing cell polarity involves the reorganization of the cytoskeleton. In particular, cell adhesion induces localized assembly of cytoskeletal networks that promote the recruitment of signaling proteins and the trafficking of domain-specific proteins [e.g., aquaporin (AQP) 2 and AQP-4] to the apical or basolateral membrane (36, 65, 68). Many proteins found in our profiling are highly enriched in the actin cytoskeleton signaling pathway (Figs. 8 and 10). One in particular, focal adhesion kinase (FAK), is a cytoplasmic protein tyrosine kinase that plays a major role in the regulation of the actin cytoskeleton at both cell-cell junctions and cell-ECM adhesions and affects cellular processes such as proliferation, motility, cell polarization, and adhesion (24, 55). The site Y397 of FAK, also identified in our profiling, is autophosphorylated in response to integrin clustering, and its phosphorylation facilitates the association of c-Src and phosphatidylinositol 3-kinase via their SH2 domains (6). It is of note that autophosphorylation and interactions via the SH2 domain are common in phosphotyrosine-mediated signaling networks. Table 4 lists the tyrosine kinases found in our study. All phosphotyrosine sites identified in these kinases are potential autophosphorylation sites that may play important signaling roles in the IMCD. Table 5 lists the IMCD phosphotyrosine proteins with SH2 domains, which recognize phosphorylated tyrosine within certain motifs (e.g., Y*xxP) (27, 59), and SH3 domains, which recognize certain proline-rich ligands.

Table 4. Gene Ontology Molecular Function: “protein tyrosine kinase activity”

Table 5. Tyrosine phosphorylated proteins that contain SH2 or SH3 domains

Recently, integrin αvβ3 has been found to colocalize with nectin at adherens junctions and to be involved in the reorganization of the cytoskeleton and subsequent adherens and tight junction formation and maintenance via PKC/FAK/c-Src activation (42, 45). Among the effectors of c-Src are Rho guanine-nucleotide exchange factors (GEFs), which regulate the Rho family of GTPases (Cdc42 and Rac) that are critical in regulating actin dynamics (53). We have identified novel phosphorylation sites on a number of Rho GEFs, including Y1128 of Arhgef11, Y636 of Arhgef5, and Y605 of Rgnef. GTPase activating proteins (GAPs) and guanine nucleotide dissociation inhibitors (GDIs) are also upstream regulators of Rho GTPases. We found a novel phosphorylation site, Y201, on Arhgap27 that can potentially be regulated by insulin and/or ephrin receptors. Because many of the novel phosphorylation sites discussed are likely substrates of the insulin and ephrin receptors, these families of kinases may have a critical role in modulating cytoskeleton organization and cell junction interactions. The insulin, insulin-like growth factor, and ephrin receptors have been previously identified in the IMCD transcriptome (64). Our database provides a foundation for further study of the role of phosphotyrosine-mediated signaling in the renal inner medullary collecting duct and other epithelia.


No conflicts of interest, financial or otherwise, are declared by the author(s).


Author contributions: B.Z., M.A.K., C.-L.C., and T.P. conception and design of research; B.Z., C.-L.C., and T.P. performed experiments; B.Z., M.A.K., and T.P. analyzed data; B.Z., M.A.K., and T.P. interpreted results of experiments; B.Z. and T.P. prepared figures; B.Z. and M.A.K. drafted manuscript; B.Z., M.A.K., C.-L.C., and T.P. edited and revised manuscript; B.Z., M.A.K., C.-L.C., and T.P. approved final version of manuscript.


The authors thank Jennifer Huling of the National Heart, Lung, and Blood Institute (NHLBI) for help in setting up the Rat Inner Medullary Collecting Duct Phosphotyrosine Database and Aleksandra Nita-Lazar of the National Institute of Allergy and Infectious Diseases for information on the antiphosphotyrosine peptide immunoprecipitation protocol. This study was supported by the Intramural Budget of the NHLBI (NHLBI Project Z01-HL-001285). Mass spectrometry was done in the NHLBI Proteomics Core Facility (Marjan Gucek, Director). Boyang Zhao was a student intern from the University of Michigan in the National Institutes of Health Biomedical Engineering Summer Internship Program, funded by the National Institute of Biomedical Imaging and Bioengineering (present e-mail: bozhao{at}mit.edu).


  • 1 This article is the topic of an Editorial Focus by Carolyn M. Ecelbarger (14a).

