Ohler et al. Supporting Dataset S2:
Detailed information on UNCOVER predictions verified by RT-PCR or EST evidence.

This file refers to the experiments in Figure 2A. Each prediction is referenced by a human Ensembl ID set containing gene, transcript, and exon ID; this refers to the exon on the 5' end of the analyzed intron. Sequences of the newly predicted exons are given, along with their coordinates within the intron. Because some of the verifications used primers internal to the skipped exons, we have sometimes not inferred both the 5' and 3' ends experimentally.

Verified UNCOVER predictions with no prior EST evidence:
-------------------------------------------------------

Fig 2A lane i (brain)
eid-ENSE00000881911.1:gid-ENSG00000004866.5:tid-ENST00000265438.3
TCGCCAGTACCTACTGCAACATCTTTTCTCCCTACACAGCGACTCCAGCTTGGGAGGGCAGGGCCAGGGTTGTCACAGCTTCCCCTGTGGTGTCTGCCTGCCAAGCACAGCTCTGGAGTTAGCCCTGGGTGTGAG
pos 740-875, 5' end exact, 3' end exact
suppression of tumorigenicity 7
Primers: CCAAAGTCAGCAACAATATGC forward, TGAGGATTGAATTCCACAGC reverse

Fig 2A lane ii (brain)
eid-ENSE00000862512.1:gid-ENSG00000126217.3:tid-ENST00000261963.3
NTTCTCCTACCACGTNCTNACTCACATNGCCAGNACCGTGATGGAACGTACNAGGAGNNTNNCCACGTNCCATTTGGATGCTGCATTA
this intron contains two UNCOVER predictions corresponding to VEGA; however, the observed skipping event contains a ~90 bp non-matching insert
Guanine nucleotide exchange factor DBS
Primers: AAAGGAAAACAGACCCCCTAA forward, AATCTGCTCCTCTGCACTTG reverse

Fig 2A lane iii (brain)
eid-ENSE00001201432.1:gid-ENSG00000168781.5:tid-ENST00000335092.3
AGCTGCTGGATGACCAGCACCCTGTGGTCCGGTTGCTGCGCAGTTTTTCCTCTGACTGTACAGGGGGCCGGCCAGTCTCCTTGGATGCCACGCTGGCGCATCACCTGCACCAGTGCTCCTACCACCTGCGCCTCTTCCGGAACTGGCTGCGCTCAGGCCAGGATGACCCCGAGTGCCTCTACG
pos 3508-3690, 3' end exact, 5' end unknown
no description
Primers: GCCGTAAAACGATTTTCTGTG forward, CGCTGGCAGACTCTACTCAA reverse

Fig 2A lane iv (brain)
eid-ENSE00001146476.1:gid-ENSG00000168781.5:tid-ENST00000335092.3
GGGGTATACTGGTGGGGGCATCCGTNTTANGACTNCATGTNTACCGTNNAGGATGATCTACCGGNGNNGTGCCATGATTAATGGCCCAATGCCGTGGAACACGT
~90 bp non-matching sequence insert; coupled with alternative 3' splicing of exon (shorter) at the same time; sequence missing in A3E: GTGTATACAGTGGGGCCAGATTATGCCCATGCTGAAG
no description
Primers: GAAAGACGGGGTCGTACATC forward, CTTTCCCCTCACTGTCTCGT reverse

Fig 2A lane v (HeLa)
eid-ENSE00001084095.4:gid-ENSG00000164402.2:tid-ENST00000331370.1
GCTGGATCTGGGTGTGTCTGGGAGGCTAGGCTGTGACAGTGAAGGAAGCTGTGTCCTCTTGCCAGCAT
pos 828-895, 3' end exact, 5' end predicted
Septin-like protein KIAA0202
Primers: CTAGGCTGTGACAGTGAAGGA forward, CAGAGGATGTTGAAGCTGAAG reverse

---------------------------------------------------------------
UNCOVER predictions with prior EST evidence but no annotation:

eid-ENSE00000663745.1:gid-ENSG00000103148.2:tid-ENST00000262313.2/6978-7010
CGAGCCATGTCCTCCAACCGATGAGCAATTGGTTGCAGG
pos 6977-7015; skipped exon, exact prediction
example EST ID: gi|15762590|gb|BI771012.1
EST contains an extra 140 nt skipped exon matching a repetitive element
CGTHBA protein containing the major human alpha-globin regulatory element

eid-ENSE00001259630.4:gid-ENSG00000007392.5:tid-ENST00000293872.4/802-912
GTTCAGTGTATCTTTGCCTGCCTACATCAATCTGCAAGGGAGTTGCAGAAAGCCTCATGTTCATCGAGCC
pos 877-946 (3' end of exon wrong b/c of 1 nt deletion in human; 5' also wrong)
example EST ID: gi|18778192|gb|BM545792.1
LUC7-like isoform a; sarcoplasmic reticulum protein LUC7B1

eid-ENSE00001199798.1:gid-ENSG00000170558.1:tid-ENST00000269141.1/19210-19287
ATGGAGCACAGACTGTAGAAATGAAAAGCACAGAAGAAAGAGCTGTGTGAAAGAAATATTTAATCATAACATGAAAAACTCATAGTGGCAACTTTTCCATTATGTTAAATTTTCCTCATTTCTATGTGTTCTGAGTATGTGCTTAAAAAGCGAGGTTGTAATTTTCTATCTTTGTGCCCATCTCCCATCGGTTTTTGATTTTGTCTTTTTGAG
pos 19209-19419; ESTs suggest long alternative 3' end rather than skipping
example ID: gi|13732428|gb|BG210741.1|BG210741
Neural-cadherin precursor

------------------------------------------------------------------------
UNCOVER predictions of
alternative terminal exons -- only the 5' end has been experimentally determined, the exon continues on the 3' end

Fig 2A lane vi
eid-ENSE00001379673.1:gid-ENSG00000159140.5:tid-ENST00000321758.3
GCGTGTTCCCTGGAAAAGAGGGACGGATGAACCTGGAAG
TAAGTAAAAGACATTCTAGGTGTGTAGCATCAAGGCAGTTAATATCCAAGCATCAGCTTT
CTCTTTATACATCTACACTGCATGGCCTGCACCAAATAAGGAACTGAACCAGGGGTATGT
TTTTACCTCCACAGCTGCCTCCTTCCATCAGAGCACCTTGATGAACTTAATGTCTAGTCA
CACGTCATTGGCATGTTTTCTCCCCAGCATTTAATT
SON protein (SON3)
Primers: CTCTATTCCTGGCCAGTTCA forward, ATCAAGGTGCTCTGATGGAA reverse

Fig 2A lane vii
eid-ENSE00001046164.1:gid-ENSG00000067369.1:tid-ENST00000263801.1
TGCTGGGAGGTAACAATCATGCCGTCT
GGAGTCTCCTGCTACCTATGCCTGATGCTGGCAGATGTGGGTTGGCTGAGGGCTCATGCA
GCATCCTTTGTCCAGATGCAGAAGTGGAACTGAATCCTGTTGC
Tumor suppressor p53-binding protein 1 (TP53BP1)
Primers: CCCATTTCATTTCACTTTGC forward, GCAACAGGATTCAGTTCCAC reverse

------------------------------------------------------------------
Sequence conservation of newly verified skipped exons in other vertebrates:

lane  ENSE     rat  dog  chicken
i     881911   Y    Y    Y
ii    862512   N    N    N        (*)
iii   1201432  Y    Y    Y
iv    1146476  N    N    N        (*)
v     1084095  Y    N    N
vi    1379673  Y    Y    N
vii   1046164  Y    Y    N

(*) amplified human sequence does not match the current genome assembly

-------------------------------------------------------------------
Random controls:

Fig 2A lane i
eid-ENSE00001321652.4:gid-ENSG00000161980.2:tid-ENST00000293860.2
Primers: AGACCATGCTGCTGTTCTG forward, ACCAAGCACATCATCCACTT reverse

Fig 2A lane ii
eid-ENSE00000868377.2:gid-ENSG00000102125.4:tid-ENST00000218246.3
Primers: GCTTCACCAAGGAGCTACAC forward, GGTTGAGCTTCTCCAAAATG reverse

Fig 2A lane iii
eid-ENSE00001239587.1:gid-ENSG00000100220.2:tid-ENST00000216038.2
Primers: CCCCAGAGGGTCAAGACTAT forward, CACTTTGGCAATGTTGTGAG reverse

Fig 2A lane iv
eid-ENSE00001307891.1:gid-ENSG00000185721.1:tid-ENST00000331457.1
Primers: GCCTCACTGTGTACCCATCT forward, TGTGGTCCTGGAGTAAGGAA reverse
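
Note on reading the ID sets: every entry in this file is keyed by a string of the form eid-<exon>:gid-<gene>:tid-<transcript>, and the EST-supported entries carry an additional /start-end suffix. The following is a minimal sketch of how such a string might be split into its parts. The names EnsemblIdSet and parse_id_set are hypothetical and not part of UNCOVER, and the suffix is returned as a plain number pair because this file does not state what it denotes.

```python
import re
from typing import NamedTuple, Optional, Tuple


class EnsemblIdSet(NamedTuple):
    """Hypothetical container for one ID set from this file."""
    exon_id: str
    gene_id: str
    transcript_id: str
    # Optional "/start-end" suffix; its meaning is not stated in the file,
    # so it is kept as an uninterpreted pair of integers.
    span: Optional[Tuple[int, int]]


# Pattern inferred from the entries in this file (an assumption, not a spec).
_ID_SET = re.compile(
    r"^eid-(ENSE\d+\.\d+):gid-(ENSG\d+\.\d+):tid-(ENST\d+\.\d+)"
    r"(?:/(\d+)-(\d+))?$"
)


def parse_id_set(text: str) -> EnsemblIdSet:
    """Split an 'eid-...:gid-...:tid-...' string into its components."""
    match = _ID_SET.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised ID set: {text!r}")
    exon, gene, transcript, start, end = match.groups()
    span = (int(start), int(end)) if start is not None else None
    return EnsemblIdSet(exon, gene, transcript, span)


if __name__ == "__main__":
    print(parse_id_set(
        "eid-ENSE00000881911.1:gid-ENSG00000004866.5:tid-ENST00000265438.3"
    ))
```

Keeping the version suffixes (.1, .5, .3) inside the returned IDs is deliberate: the entries refer to specific Ensembl versions, and stripping them would lose that information.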