2.2). This could be attributed in part to the limited accessibility of mass spectrometry, which requires large and expensive equipment. In general, however, Y2H and AP/MS techniques are complementary in the kinds of interactors they detect. If a set of proteins forms a stable complex, then an AP/MS screen can determine all the proteins within the complex, but may not necessarily confirm every interacting pair (the binary interactions) within the complex. On the other hand, a Y2H screen can detect whether any two given proteins directly interact. While stable interactions between co-complexed proteins can be accurately determined using AP/MS techniques, Y2H techniques are useful for identifying transient interactions between the proteins. However, due to considerable functional cross-talk within cells, Y2H can also report an interaction even when the proteins are not directly connected. In addition, some types of interactions can be missed in Y2H due to inherent limitations in the technique (e.g., interactions involving membrane proteins, or proteins requiring posttranslational modifications to interact), but these limitations may also occur with AP/MS-based approaches [Brückner et al. 2009]. Therefore, only a combination of different approaches, which necessarily also includes computational methods (to filter out incorrectly detected interactions), will eventually lead to a fairly complete characterization of all physiologically relevant interactions in a given cell or organism.
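The complementarity between co-complex and binary evidence can be illustrated with a minimal sketch. The protein names and hit lists below are hypothetical, and the helper functions are introduced only for illustration; they do not correspond to any published pipeline. The sketch enumerates every pair implied by membership in one purified complex and then checks which of those pairs are also supported by direct binary evidence.

```python
from itertools import combinations


def cocomplex_pairs(members):
    """Return all unordered protein pairs implied by membership in one complex."""
    return {frozenset(pair) for pair in combinations(sorted(members), 2)}


def classify_pairs(complex_members, binary_hits):
    """Compare co-complex pairs with binary (Y2H-like) hits.

    Returns three sets of pairs:
      confirmed   - co-complexed and also reported as a binary interaction
      unconfirmed - co-complexed, but with no binary evidence
      outside     - binary hits involving a protein not in the complex
    """
    candidates = cocomplex_pairs(complex_members)
    binary = {frozenset(pair) for pair in binary_hits}
    return candidates & binary, candidates - binary, binary - candidates


# Hypothetical data: one purified complex and a handful of binary hits.
complex_members = {"A", "B", "C", "D"}
binary_hits = [("A", "B"), ("B", "C"), ("C", "E")]

confirmed, unconfirmed, outside = classify_pairs(complex_members, binary_hits)
print("co-complex pairs:", len(cocomplex_pairs(complex_members)))
print("confirmed by binary evidence:", sorted(sorted(p) for p in confirmed))
print("not confirmed:", sorted(sorted(p) for p in unconfirmed))
print("binary hits outside the complex:", sorted(sorted(p) for p in outside))
```

In this toy case, the purification implies six candidate pairs among four proteins, of which only two have direct binary support. This is the sense in which a co-complex screen identifies membership without confirming every pairwise contact.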
Protein-Fragment Complementation Assay (PCA)
PCA is a relatively new technique that can detect in vivo protein interactions as well as their modulation or spatial and temporal changes [Michnick 2003, Morell et al. 2009, Tarassov et al. 2008]. Similar to Y2H, PCA is based on the principle of splitting a reporter protein into two fragments, each of which cannot function alone [Michnick 2003]. However, unlike Y2H, PCA is based on the formation of a bimolecular complex between the bait and prey, where both are fused to the split domains of the reporter. Importantly, the formation of this complex occurs in competition with alternative endogenous interaction partners present within the cell. The interaction brings the two split fragments into proximity, enabling their non-covalent reassembly, folding, and recovery of protein reporter function [Morell et al. 2009]. Typically, the reporter proteins are fluorescent proteins, and the formation of bimolecular complexes is visualized using bimolecular fluorescence complementation (BiFC). BiFC can also be used to map the interaction surfaces of these complexes. This enables investigation of competitive binding between mutually exclusive interaction partners as well as comparison of their intracellular distributions [Grinberg et al. 2004].
PCA can be used as a screening tool to identify potential interaction partners of a specific protein [Remy and Michnick 2004, Remy et al. 2007], or to validate interactions detected by other techniques such as Y2H [Vo et al. 2016]. In one of the first applications of PCA on a genome-wide in vivo scale, Tarassov et al. [2008] identified 2,770 interactions among 1,124 proteins from S. cerevisiae. Vo et al. [2016] used PCA as an orthogonal assay to reconfirm the interactions detected in S. pombe (from the FissionNet network consisting of 2,278 interactions; discussed earlier). PCA has also been employed to validate interactions between membrane proteins or membrane-associated proteins [Babu et al. 2012, Shoemaker and Panchenko 2007] (discussed next).
Techniques for Inferring Membrane-Protein Interactions
Membrane proteins are attached to or associated with the membranes of cells or their organelles, and constitute approximately 30% of the proteomes of organisms [Carpenter et al. 2008, Von Heijne 2007, Byrne and Iwata 2002]. Being non-polar (hydrophobic), membrane proteins are more difficult than soluble proteins to crystallize for traditional X-ray crystallography, and are the least studied of all proteins by high-throughput proteomics techniques [Carpenter et al. 2008].
Membrane proteins are involved in the transportation of ions, metabolites, and larger molecules such as proteins, RNA, and lipids across membranes, in sending and receiving chemical signals and propagating electrical impulses across membranes, in anchoring enzymes and other proteins to membranes, in controlling membrane lipid composition, and in organizing and maintaining the shape of organelles and the cell itself [Lodish et al. 2000]. In humans, the G-protein-coupled receptors (GPCRs), which are membrane proteins involved in signal transduction across membranes, alone account for 15% of all membrane proteins, and 30% of all drug targets are GPCRs [Von Heijne 2007]. Because membrane proteins play such key roles, identifying interactions involving these proteins has important applications, especially in drug development.
Membrane protein complexes are notoriously difficult to study using traditional high-throughput techniques [Lalonde et al. 2008]. Intact membrane-protein complexes are difficult to pull down using conventional AP/MS systems. This is due in part to the hydrophobic nature of membrane proteins as well as the ready dissociation of subunit interactions, either between trans-membrane subunits or between trans-membrane and cytoplasmic subunits [Barrera et al. 2008]. Further, membrane protein structure is difficult to study by commonly used high-resolution methods including X-ray crystallography and NMR spectroscopy.
A major avenue by which one can understand membrane proteins and their complexes is by mapping the membrane-protein “subinteractome”, the subset of interactions involving membrane proteins. The conventional Y2H system is confined to the nucleus of the cell, thereby excluding the study of membrane proteins. New biochemical techniques have been developed to facilitate the characterization of interactions among membrane proteins. Among these is the split-ubiquitin membrane yeast two-hybrid (MYTH) system [Miller et al. 2005, Kittanakom et al. 2009, Stagljar et al. 1998, Petschnigg et al. 2012]. This system is based on ubiquitin, an evolutionarily conserved 76-amino acid protein that serves as a tag for proteins targeted for degradation by the 26S proteasome. The presence of ubiquitin is recognized by ubiquitin-specific proteases (UBPs) located in the nucleus and cytoplasm of all eukaryotic cells. Ubiquitin can be split and expressed as two halves: the amino-terminal (N) half and the carboxyl-terminal (C) half. These two halves have a high affinity for each other in the cell and can reconstitute to form a pseudo-ubiquitin that is recognizable by UBPs.
In MYTH, the bait proteins are fused to the C-terminal half of a split ubiquitin, and the prey proteins are fused to the N-terminal half. The two halves reconstitute into a pseudo-ubiquitin protein if there is affinity between the bait and prey proteins. This pseudo-ubiquitin is recognized by UBPs, which cleave after the C-terminus of ubiquitin to release a transcription factor carried by the bait construct; the transcription factor then enters the nucleus to activate reporter genes.
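The chain of events in a MYTH readout can be summarized as a toy Boolean model. This is only a sketch under strong simplifying assumptions: the set of truly interacting pairs is taken as known, every step is treated as all-or-nothing, and false positives and false negatives are not modeled. All function names and protein labels are hypothetical.

```python
def pseudo_ubiquitin_forms(bait, prey, interacting_pairs):
    """The split-ubiquitin halves reconstitute only if bait and prey have affinity."""
    return frozenset((bait, prey)) in interacting_pairs


def reporter_active(bait, prey, interacting_pairs):
    """Follow the readout: reconstitution -> UBP cleavage -> reporter activation."""
    reconstituted = pseudo_ubiquitin_forms(bait, prey, interacting_pairs)
    # UBPs recognize only the reconstituted pseudo-ubiquitin, so the
    # transcription factor is released only in that case.
    transcription_factor_released = reconstituted
    # The released transcription factor enters the nucleus and switches on
    # the reporter genes.
    return transcription_factor_released


def screen(bait, prey_library, interacting_pairs):
    """Return the preys from a library that switch on the reporter for one bait."""
    return [prey for prey in prey_library
            if reporter_active(bait, prey, interacting_pairs)]


# Hypothetical ground truth and prey library for a single bait.
true_pairs = {frozenset(("BaitX", "PreyA")), frozenset(("BaitX", "PreyC"))}
library = ["PreyA", "PreyB", "PreyC"]

print(screen("BaitX", library, true_pairs))
```

The point of the sketch is that the reporter signal is a proxy for bait-prey affinity: nothing downstream of reconstitution depends on where in the cell the two proteins meet, which is why the assay is not confined to the nucleus.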
Two of the earliest studies using MYTH screens reported a fair number of interactions among membrane proteins from yeast: 343 interactions among 179 proteins by Lalonde et al. [2010], and 808 interactions among 536 proteins by Miller et al. [2005]. PCA has also been adopted to identify and/or verify membrane-protein interactions. For example, Babu et al. [2012] used PCA to validate and integrate 1,726 yeast membrane-protein interactions obtained from multiple studies, and these encompassed 501 putative membrane protein complexes.
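To put the reported counts in perspective, one can compute two simple summary statistics for each screen: the average number of interactions per protein and the network density (the fraction of all possible protein pairs reported as interacting). The sketch below uses only the interaction and protein counts quoted in the text, including the PCA screen of Tarassov et al. [2008] mentioned earlier. It assumes each interaction is an undirected pair of distinct proteins, and the function names are illustrative.

```python
def average_degree(num_interactions, num_proteins):
    """Mean number of interactions per protein in an undirected network."""
    return 2 * num_interactions / num_proteins


def density(num_interactions, num_proteins):
    """Fraction of all possible unordered protein pairs reported as interacting."""
    possible_pairs = num_proteins * (num_proteins - 1) / 2
    return num_interactions / possible_pairs


# (interactions, proteins) as quoted in the text.
screens = {
    "Lalonde et al. [2010] (MYTH)": (343, 179),
    "Miller et al. [2005] (MYTH)": (808, 536),
    "Tarassov et al. [2008] (PCA)": (2770, 1124),
}

for name, (m, n) in screens.items():
    print(f"{name}: average degree {average_degree(m, n):.2f}, "
          f"density {density(m, n):.4f}")
```

All three networks are sparse: each protein has only a few reported partners on average, and well under 5% of possible pairs are reported as interacting.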
The mammalian membrane two-hybrid (MaMTH) assay, the mammalian counterpart of MYTH, is also based on the split-ubiquitin approach and is derived from the MYTH assay. Stagljar and colleagues [Petschnigg et al. 2014, Yao et al. 2017] used MaMTH to probe interactions involving the epidermal growth factor receptor/receptor tyrosine-protein kinase (RTK) ErbB-1 (EGFR/ERBB1), Erb-B2 receptor tyrosine kinase 2 (ERBB2), and other RTKs that localize to the plasma membrane in human cells. When applied to human lung cancer cells, the assay identified 124 interactors for wild-type and mutant EGFR [Petschnigg et al. 2014].
2.2 Data Sources for PPIs
Several public and proprietary databases now catalog protein interactions from both lower-order model organisms and higher-order organisms (summarized in Table 2.2). These databases hold PPI data in a format accepted for data deposition, such as that specified by IMEx (http://www.imexconsortium.org/submit-your-data) [Orchard et al. 2012]. The Biomolecular Interaction Network Database (BIND) [Bader et al. 2003], now called the Biomolecular Object Network Database (BOND), includes experimentally determined protein-protein, protein-small molecule, and protein-nucleic acid interactions. BioGRID [Stark et al. 2011] catalogs physical and genetic interactions inferred from multiple high-throughput experiments. The Database of Interacting Proteins (DIP)