A 11 Exercise with answers

THE FOLLOWING ALIGNMENT IS GIVEN:
CLUSTAL 2.1 multiple sequence alignment
1 -----------ATT-TCCTTGCCTTAAA----TCAATCAA----CCGTACTCACA--CTG 38
5 ---------GGACTATCCTCTCCTCAAAGATCCCCACC------CCGTATTCAAT--TCC 43
4 CCCACTTTGGAATTTTCTTTACCTTA------CCTAAGAA----GGGTATTAATT--TGA 48
2 CTCTCCTCGTGGTCGCCAGCACTGCCGTG---CCGAGCCGGAGGCCGAGCTCGGCAGCTG 57
3 ----------GGCCCTCA-CTCTCCC------TCGGTCC----------CTCGTC----- 28
 * * * * 
1 TTCTTCTTCTT-----------CTAACAACTTCTGC--------AATTTGTTCTCA---- 75
5 CTCTTCTCCTT-----------CTCTGAATTGCT------------TTCCATCTC----- 75
4 TTCTTGTGGGAAGGAAGAAGGATCAAGAATGGCG----------ATTTCGATTTCT---- 94
2 CCCGCCTCGGTGTGTCGC--CGCCGCACCCCACCCACGTAGCAAAAGCCGTCTTCCCGTA 115
3 TCCACCTTCGC---TCGCG-CGCCG-GCGCCGCTCAC--------ACCCGATTTCC---A 72
 * * * ** 
1 -TTTATG--TTCAG-----ACAGCTTCTCAGCAGC-----------TAATCTCT-CAAAA 115
5 -TTTGTG--TTTTC-----TCCATTTTTT----GT-----------T--TCTTT-CTGGA 109
4 -GCTATGAGTTTTGGAACCTCAGTTTCTTCA---------------TATTCTTG-TTTTA 137
2 CCGTAGGCAGCGGCA---AACAGCGCCCCCTCATCACTCGGTCTCGTTCCCTCCTCTCCA 172
3 CCTCACAAAACC-CA---CACGGAACCCCGGCACC--------------CCTCC------ 108
 * ** 
1 ATAATAAGAAAATGGCAGCAG------CAGGTCTCTCTTCTTCTTCATGTACT-GTATTC 168
5 GAGAT---AGAATGGC--CA-------TGAGTATCTCACCG-------GCAGC-ACATCC 149
4 GAGCT---AGGA--GTTTTGA------GAAGT----CATCA---------GTT-TTATGC 172
2 ACTTCACTCGCGTCGCGCCCCTCCACTGCACCAG--CGTCATG-----GCGGTGGCCTCG 225
3 ACTCCACCAG----GCGCC----------AGCGT--CGTCATG-----GCGGTAGCCTCG 147
 * * * * 
1 ACACTCTCCTCATCCTTTAAAACCAGAA-GACA-TCTCACTAAAACCCCACAAAACCCTT 226
5 A-ACCTCAATCTCCATTAGAAATTGTAA-AACCCTTTCCCCTTTCCCCCGCCAAA----A 203
4 A----------ATTCCCAGAACCCAT---GTCGGTTTAATTCTGTTTTTCCGA------- 212
2 ACCTCGCCGCTGTCCGCCAAGCCCGCCACGGCCCCTTCGCCGCCCGCTCCCGGA------ 279
3 ACGTCGCCCCTTTCCGCCAAGCCCACCACGGCCACCTCGCCGCCCACTCCCGCA------ 201
 * * * * * 
1 TGCTTTTCAAACGCAACAACAGCAGCAACCGCACTTCACATCCCTTTCTT----CCTTCC 282
5 TG-TTTTGAGATACAATCCC---------CTCTCTCCTCGTCGATCTAAA----GTCTCT 249
4 ---TTCGGAAATCTGATGGG------------GCTTCACGGTGTTCT-------GTTTCT 250
2 ---TCCGGGCTCCTCGCTCTCGG------CGTTCGCCGCGCCCCCGCCACTGCCGCGTGG 330
3 ---TTCGGCTTCCTCGCTCCCCG------CTCCCGCCGCGGCCGCGCCAC---CGCGTGG 249
 * * * * * 
1 AGAAAGTT--GTGTAGAGTTCAAGCAACAATTTTGAGAGAAGATGAAGAGAAGAAAGTTG 340
5 GGGAAGCT--GTTCCGAGTTCAAGCAACAGTTTTGCGAGAAAATGATGAG---AAAGTGG 304
4 AGGAAATC--ATGTAGAGTTCGAGCAACGTTGTTACAAGAGAATGAAGAA---GAAGTGG 305
2 AGGAGGCTCCGCGTGGAGG-CGATCAG-GACGCAGCGA------------ACGGAGGTGC 376
3 AGGAGTCTCCTCGTGGAGG-CGATCCG-GACGCAGCGGGAGAAACAGGGGACGGAGGGGC 307
 * * *** * * * * * 
1 TTGTTGAGGAATCATTTCAACCAAAG---------------ACTTTCACTCATGAGCC-- 383
5 TGGTAGAGGAAACATTTCAACCCAAA---------------ACTTCTACAAATGAGGAGA 349
4 TTGTGGAGAAATCTTTTGCACCTAAG---------------AGTTTTCCTGATAACGTGG 350
2 CCGTCGAGGAGTCCGCCCCCGCCAGGGACGCCGCCGCTGCCGCGCCCCTGGACGGAAACG 436
3 ACTTCGAGGAGTCCGCC---------------------GCCGCGCCCCTGGACGGAGTCG 346
 * *** * * * 
1 -------------TGTTCGGGGCTCACCACAATCTTCATCGCCTGGTGGATTAGAGACTT 430
5 GAAAAGGAGGAGATGGT-GAGCCCCAGGATGA--TTCCTCATCAGGTACTCTGGAGAAAT 406
4 GAGGGGGAAGTAATGGG-AAGCCACCAGATGA--TTCATCCTCTAACGGTCTAGAGAAAT 407
2 GAGCCGGAGCGGACG-----GCTCCGTGGT----TCCTTCCTCGGACG------ACAGCT 481
3 GAGTCGGAGCCGATG-----ACCCCGTGGT----TCCCTCCTCGGACGC---GAGCGACT 394
 * * * * ** * *
1 GGGCTATCAAGCTTGAGCAATCTGTCAATGTCTTTCTCACTGATTCGGTGATAAAAATTC 490
5 GGGTTATCAAGCTTGAACAATCTATCAATATCTTTCTCACGGATTCAGTGATAAAGGTTC 466
4 GGGTTATAAAGCTTGAGCAGTCTGTAAATATCTTACTCACGGATTCAGTGATAAAGATTC 467
2 GGGTTGTCAAGCTCGAGCAGTCGTTCAACATTTTCGCCACGGATTCGGTGATTATGGTAC 541
3 GGGTGGTCAAGCTCGAGCAGTCCTTCAACATTTTCGCCACGGATTCGGTGATAATGGTAC 454
 *** * ***** ** ** ** * ** * ** *** ***** ***** * * *
1 TTGATACTCTTTATCACGACCGCGATTATGCTAGGTTCTTTGTTTTGGAAACTATTGCTA 550
5 TAGATTCTTTGTACCATGACCGAGATTATGCAAGGTTCTTCGTGTTGGAAACTATTGCAA 526
4 TTGACACTTTGTATCACAACCGAAACTATGCGAGGTTTTTTGTTCTGGAAACAATTGCAA 527
2 TCAAGGGCGTGTACGGTGATCGGTACTACGCCAGGTTCTTTGCGCTGGAGACGATTGCGA 601
3 TCAAGGGCGTGTACCGTGATCGGTACTACGCCAGGTTCTTTGCGCTGGAGACGATTGCCA 514
 * * * ** * ** * ** ** ***** ** * **** ** ***** *
1 GAGTTCCTTATTTTGCCTTTATATCTGTTCTTCACATGTATGAGAGTTTTGGTTGGTGGA 610
5 GAGTTCCTTATTTTGCCTTTATATCTGTCCTTCATATGTATGAGAGTTTTGGTTGGTGGA 586
4 GGGTTCCTTATTTTGCATTTATATCGGTTCTTCACATGTATGAGAGCTTTGGCTGGTGGA 587
2 GGGTGCCGTACTTCGCATTCATATCGGTGCTTCACTTGTATGCGACCTTTGGATGGTGGA 661
3 GGGTGCCGTATTTCGCGTTCATATCGGTGCTTCACATGTATTCGACCTTTGGCTGGTGGA 574
 * ** ** ** ** ** ** ***** ** ***** ***** ** ***** *******
1 GAAGAGCTGATTATCTCAAAGTGCATTTTGCTGAGAGCTGGAATGAGATGCATCACTTGC 670
5 GAAGAGCGGATTATCTCAAAGTACATTTTGCTGAGAGCTGGAATGAGATGCACCATTTGC 646
4 GAAGGGCAGATTATATGAAAGTGCATTTTGCTGAAAGCTGGAATGAGATGCACCATTTGC 647
2 GACGAGCTGATTACATAAAGGTTCACTTTGCGCAGAGCTGGAACGAGTTCCATCACCTCT 721
3 GACGAGCGGATTATATAAAGGTTCACTTTGCGCAAAGCTGGAACGAGTTCCATCACCTCT 634
 ** * ** ***** * ** ** ** ***** * ******** *** * ** ** * 
1 TTATCATGGAAGAATTGGGTGGTAATTCTTGGTGGTTTGACCGGCTTCTTGCTCAAGTTA 730
5 TTATCATGGAAGAATTGGGGGGCAATGCTTGGTGGTTTGATCGGTTTCTTTCCCAACATA 706
4 TCATTATGGAAGAATTAGGGGGAAATGCTTGGTGGTTTGATCGATTTCTTGCACAACATA 707
2 TGATCATGGAAGAATTGGGTGGCGACTCTTTGTGGTTTGACTGTTTTCTTGCTCGGTTTA 781
3 TGATCATGGAAGAATTGGGTGGCAACTCTTTATGGATTGACTGTTTCCTTGCTCGGTTTA 694
 * ** *********** ** ** * *** *** **** * * *** * * **
1 TAGCAACCTCTTATTATTTCATGACAGTCTTAATGTATGCATTGAGCCCAAGAATGGCAT 790
5 TAGCCGTCGTGTACTATTTTATGACAGTTCTCATGTATGCAATAAGCCCAAGAATGGCTT 766
4 TAGCTATATTCTATTATTTCATGACAGTCTTGATGTATGCTTTGAGCCCGAGAATGGCAT 767
2 TGGCATTCTTTTACTACTTCATGACTGTTGCAATGTACATGCTGAGCCCACGAATGGCAT 841
3 TGGCGTTTTTTTACTACTTCGTGACTGTTGCGATGTACATGCTGAGCCCAAGAATGGCAT 754
 * ** ** ** ** **** ** ***** * ***** ******* *
1 ATCACTTCTCTGAATGTGTTGAGAGCCATGCATTTGCAACTTATGACAAATTTATCAAGG 850
5 ATCACTTTTCTGAATGCGTGGAAAGCCATGCATTTTCAACTTATGACAAATTTATCAAAG 826
4 ATCATTTCTCTGAATGTGTGGAGAGCCATGCATACGAGACTTACGATAAATTCATCAAGG 827
2 ATCACTTTTCCGAATGTGTGGAGAGACATGCATATTCCACCTATGATGAGTTCCTCAAGC 901
3 ATCACTTTTCTGAATGTGTGGAGAGACATGCATATTCCACCTATGACGAGTTCCTCAAGC 814
 **** ** ** ***** ** ** ** ******* ** ** ** * ** **** 
1 CCCAAGGAGATGATTTGAAAAAATTGCCTGCACCTGAGGTTGCTGTAAAATATTATACCG 910
5 CACAAGGAGAGGAGTTGAAAAAATTACCTGCTCCTGAAGTTGCTATAAAATACTACACTG 886
4 ATCAAGGAGAGGAATTGAAGAATTTGCCCGCTCCAAAGATTGCAGTGGACTACTACACGG 887
2 TCCATGAAGAGGAATTGAAAAGACTACCAGCTCCAGAGGCAGCATTGAACTATTACATGA 961
3 TCCATGAAGAGGAATTGAAAAGACTACCAGCTCCAGAGGCAGCATTAGAATATTACCTGA 874
 ** * *** ** ***** * * ** ** ** * ** * * ** ** 
1 AGGGTGATTTGTACTTGTTTGATGAATTTCAAACTTCCAGAGCTCCCCATTCTCGTAGGC 970
5 GTGGAGACTTGTACTTGTTTGATGAGTTTCAAACTTCCAGAGCCCCCAATACACG----- 941
4 GAGGTGACTTATATTTATTTGATGAGTTTCAAACTTCACGAGAGCCTAATACTCGAAGAC 947
2 ATGAGGACCTTTACTTATTCGATGAGTTTCAGGCATCAAGAACTCCAGGTTCTAGGAGGC 1021
3 ATGAGGACCTTTACTTATTCGATGAGTTTCAGGCATCAAGAAGTCCAGGTTCTAGGAGGC 934
 * ** * ** ** ** ***** ***** * ** ** ** * * * 
1 CAAAAATAGAGAATTTGTATGATGTATTTCTGAACGTCAGAGATGATGAGGCTGAACATT 1030
5 ------------------------------------------------------------
4 CAAAAATAGATAATCTCTATGACGTATTCATGAACATTAGAGATGACGAAGCAGAGCATT 1007
2 CTAAAATAGATAACTTATACGATGTATTCGTTAATATACGAGAAGATGAGGCAGAGCACT 1081
3 CTAAAATAGATAACTTATATGATGTATTTGTCAATATACGAGATGACGAGGCAGAACACT 994
 
1 GTAAGACCATGAAGGCCTGCCAAACACATGGAAATCTCCGCTCTCCACATTCATATCCAG 1090
5 ------------------------------------------------------------
4 GTAAAACGATGAAAGCCTGTCAAACTCACGGGAGCCTTCGTTCTCCACACACAGATCCAT 1067
2 GCAAGACAATGAAGACCTGTCAAACACATGGAAATCTTCGTTCTCCTCATTCAA---CGC 1138
3 GCAAGACAATGAAGACCTGTCAAACACACGGAAATCTTCGTTCTCCTCACTCAA---CGC 1051
 
1 AGGATGCTTTTGAAGATGATACTGGCTGTGATCTTCCTCAAGCAGATTGTGAAGGTATTG 1150
5 ------------------------------------------------------------
4 GCGATGATTCTGAAGATGATACAGGGTGTTCCGTACCTCAAGCTGATTGTATAGGTATCG 1127
2 CGAACTGCTTAGAAGATGATACGGAATGTGTAATACCTGAAAACGACTGTGAAGGTATTG 1198
3 AGAACTGCTTAGAAGCTGATACGGAATGTGTAATACCTGAAAACGATTGTGAAGGTATTG 1111
 
1 TTGATTGTATAAAGAAATCTGTAACATCTCCTCCATCAAAGCAAAATATTTGAAGGAGAA 1210
5 ------------------------------------------------------------
4 TGGATTGTATAAAGAAGTCAGTCACCGATACTCAAGTAAC-CAAAAGGTAGGAAAAGGAA 1186
2 TGGACTGTGTCAAAAAGTCCCTTAC---------------------------AAAGTAAA 1231
3 TGGACTGTGTCAAAAAGTCCCTTAC---------------------------AAAGTAAA 1144
 
1 CAA---GGAAACAGTATACATTT-----------CAGGTGATGGA--------------- 1241
5 ------------------------------------------------------------
4 AAACGCGGACAAACTATACTTGTATATACTAGTATAGACAAAAAAAAAAAAATACAAAGA 1246
2 TAG---------ACTATATTCTTT----------AATGTGTTTGA--------------C 1258
3 TAG---------ACTATATTCTTT----------CGTGT--------------------- 1164
 
1 --TAGTGAGATATTGTAT-----------ACATAGAAA-------GAAACATTACAA--- 1278
5 ------------------------------------------------------------
4 TATAGGTACATGTTGTATCTTTTTGTCTGAGGTAGAGATGTTTTGGAAAGTTTTCAAGGC 1306
2 TGTTTTCAGGTGTCGTA------------GTATACAAAG----TATAAAATTTTGATG-- 1300
3 -GTTTCCAGGCGTCATA------------GTATACAAAG----TACAAAATTTTTATG-- 1205
 
1 -ATTTACATAATTTACAT-------------------------GCCTTCAAGCTAAAAAT 1312
5 ------------------------------------------------------------
4 AAAATACATAGTTTACCTCATGTAAGCTCTCCTTACTAAGATTGTTTTTATATTTAGAAT 1366
2 -ATATCCTCTCT------------------------------------------------ 1311
3 -ATATCCTCTTTTTTCCT---------------------ATGTACCTTTTTAAGTAAAAT 1243
 
1 ------------------------------------------------------------
5 ------------------------------------------------------------
4 TTGTACCCGAAACCTGTTATAGGTGGATGGATCTTGACAACTTTTGGATATATCGACTCC 1426
2 ------------------------------------------------------------
3 CTGGGTCATATGTAAGAGAAGCCCAGTTTAGAACTTTAGATACAATTATGCAAC------ 1297
 
1 -------------------------------------
5 -------------------------------------
4 ATGAGCTTTTTACACACGCAAAAAAAAAAAAAAAAAA 1463
2 -------------------------------------
3 -------------------------------------
 
1) Obtain the sequences in FASTA format.
2) Identify each sequence: function (if any) and organism.
3) If a sequence corresponds to a cDNA or gene, give the deduced amino acid sequence.
4) Obtain a complete cDNA from Prunus persica using EST data, choosing the one with the highest identity to sequence 5, used as bait.
5) What is the phylogenetic relationship among the sequences, as obtained with Clustal?
Given the annotated gene sequence below, answer questions 6 to 9:
6) Which gene is this? From which organism?
7) How many exons and introns are there?
8) Obtain the cDNA sequence (identifying the UTRs and the ORF) and the deduced protein.
9) Regarding the information contained in this gene's promoter: which environmental stimuli could influence its expression?
acatgaaaataaccagcaaatactaatttagcatgttatagtgaagagggcaaaaagttaacaagatattatccttctcgatagaagaaattgacaaagggaatatgcagaaagaaaagagggaggacccaattggatttgcctgaatttcaccacatatctatcttcgaacatgataaaagtgacgtatcttgaggaggaccacacatgcatatgcgactttctaagattctacacatcaatactttatctggtttttccatcatccaacctatagtactgtaagttgacaacgttatatgaagatgctgttgtttttcttagtattccttttcttttgcaccatctgcatttttctcaagcaatttctctgatacttgagatgtttggtggttgagaaaacctccaaagctacctgtccgtagttgaggaaataatgagggttgagaggtaacatgtgatggtgttcgaaaagtaacactgagggatgcatctatttttagttaattcagcaccgtacagtgtgagttaagttgggaaaattattgatgaaaagatgttgcggattatagtgtgctttaattagtatgagagtggtggtgtaggagtagttaaagcatagaaaattaagaaatggaaaagcataatattgtggggtaggaagagggagaagagggccaagtgtatgaaaagtaaagggagcataacacacactctcatggggcataattgcaccccatgctcatgcctaatcattttgcttcccatctcaatcatgtctccatccctaacaatcatgaaccatctttctttccttaatttcctctttttctgcaatttatcactaaaactgttctttatcttttagatagttaagtttaacaacggatgagcccattggggtctgttctccatgttggatcctattgcaccactcctttctccctccagcatcggattctcagctcttgtcgtatataaataaactcccacttcaatatattttgtcctatatagtttgcaaattactacaatctaaatcaatttgaccctactatacgtttctttttaatttctatattcaaatctcatttactaaaatagcataaattatgtatataaactaaaagaagtataataggtcttgtttttttaatcaaaattcctttactatcacaatcacaatgttagctctccagtaataagtcttgtagtgtttgaaaatgtatggacatggctcaatggagttcaattgtctcttgatcatgaatttcatattacgaagtagactttaagcctaagtcaactcctagtcgatgtgggatctttaatacacccctcacgccgagagtgacagttatgactctgataccatatcacgaagtagactttaagtctaactcaaccccataaaatcgactcatggagtgaggttcacaaccacttatatacaatgaaaatttcataatctctaattgatgtgggatctccaacatactttgtaagtgtgatgcggaacacactttataagtgtggtgcaggtaggtcgggagaccggataatcgagtagtaaaatatagttgagcctccaatagtacacaagattgtaagagttttgttttcaagaacatcatattctcacattggttaagttttgaaattgaatgaaaaataagcaaaattggataaaaatataatgtagtgtaggttgaaggaaagcaaaatatttagattaaaaaggataaaaagagtgagaaaagctatggatagaggggatggaagtgcaatgcaagtgagcccagaattttggggcaaacgaatgagcatggaggcaaagaggcaaagaggcagcaggtgcagtgcatgtcgttttgcatggcctgatgattcgtgaaggaatttgttttgtcgtttagtggaaacattatcgtttgttcccatattttgttgaggaaaagacaacacatgtctatgttggtccgtcacagaaagaattgcat
tgtggtgagaacgtgcagtcttattttctttcttgcatcatttatcatattttggcctataccaacgaacgtaccgttttttgtttttttgcaaaagccatgtttgcatgagacgacccagtgggtggaaagaaggcccaaatggatagtgaagaaatagcggaagtggtaaaaatgggccaagctaataaaagcaatgtgtgtgcggagccacagaaggaaaggacacgcgtgaagaacaacacgaatgtctcgttgaggcaatgtgctgagaacgtggacaccttcatttaaaaacccaaactcattgaaacttgaggtttgatttgggcgtccaatcccgtgtatcagttttcgccttcaggttaatcgctttcttcctcctatctctctaacattctctctaacattctcatgcattgctttttttctctgctttctttttccaattttcaaacaaaatgttcggattctcatataatcacctaccattagtaataatgtcttttctttcgtaactcgatgaatgtgtttaattacttactcttctctcaattgtatccatactaccaggctgtgatcaatttttatttttttgagttaaaattgttccatgttttccccttgtattgtcttgtctttcaatagtttgagaggaaatgactgctttgtgcagattttgaaagacaatggccgatgctgaagatattcagccactcgtctgcgataatgggacaggaatggttaaggtaaaacatgaaagtaggagtgattggatatgtagtatgttcttttaggtgtgggtggtgtctttggatttgtactgatttagtgagtcttttgcactgtactgcgttgcattgcaatttggataatgcaggctggatttgccggtgatgatgctccgagggctgtgtttcccagcattgttggtcggccacgtcatactggtgtgatggttgggatggggcaaaaggatgcatatgttggggatgaggctcagtccaagcgtggtatactgactctgaaatatcccattgagcacggtattgtgagcaactgggatgacatggagaagatctggcatcacaccttctacaatgaactccgtgtggccccagaggagcaccctgttctgctcactgaagcccctctcaatccaaaggctaatcgtgagaaaatgacccaaatcatgtttgagaccttcaacacacctgccatgtatgttgccatccaggctgttctgtccctctatgccagcggtcgtacaactggttagtctttattatgctgtctataaaaaaaacagccaatcctctgcaaggttcacattgcccttgaaaacatggaccagaaaattgaatgtggagaaaagaataacaaaaagggctcaagggtctgtttgattgccttttcatttttcttattttagaaactgttttatattcaacaatcaatttctaaccccagagagggttcattagttttgaaaacgatttcaaaagtaaacaaatggaaacaactttgagatgtgttttcattcttcttgtattctgtgaccctaggtcaagagtgttgttggtcattaaggatggtgctaatgtagtgaggcatgatatgagctgtgttggctttgcagagatttattcgtgatttcaagagctaaacgatttttgaaaatagaaatcaacactccctgattctcaaagttcaaacattttggttttgaccctgctaattttatatgcttaaacagttatctgatatgaagtaattgagttactctaacttggctggttgccgttatggttggtacactaatttgtgtaggcattgttctggactctggagatggtgtcagccacactgtacccatttatgaaggctatgccctccctcatgccatcctccgtcttgacttagcaggacgtgacctcactgatgccctgatgaaaatcctcactgagcgtggttactccttcactaccacagctgagaga
gaaattgtaagggatatgaaggaaaaattggcttatattgcccttgattatgaacaagagcttgaaacttcaaaaactagttcatctgttgagaagagctatgagctgcctgatggacaggtcatcaccattggtgccgagcgcttccgttgtcctgaagttctcttccagccatccatgattggaatggaagctgttggcattcatgagacaacatacaattcaattatgaagtgtgatgttgatatcaggaaggatttgtatggaaacatagtcctcagtggtggttctaccatgttccctggtattgcggatagaatgagcaaggaaatcactgccctggccccaagcagcatgaagatcaaggtggttgctccacccgagagaaagtatagtgtctggattggtggctctatcttggcttctctcagcactttccaacaggtaaacattctgtgattctaatgaccaaagtaatttgtgttgtgataaacctaaaacggtgatgtctcatgcactttattacaactgcatgacggagacatcaagtgtgtgattgtctttattttagttttagttctgatgctcttatctaaatctcagtttgatagcttggttgcttgaattttgtctccttctcatgacagttgaggttcataggccaattgggatttgtgtaacttactatgcatactaatagcaagcggcatctcatgcatttgggcacacatgctagaatgttttttatatgggtcagcttattaagattattaatgagaatgcaattgtttttggaaatgcagatgtggattgcgaaggcagagtacgatgagtccggtccttcaattgtacacaggaaatgcttttaaatgcagcagaacggcttggtgtcctcgattggcaatgcagtagccccttgcaatgattgaagatctttcttaaaaaatgtttgtgttgttagaacagtgtactaagataaagtggagttttatggcagaattaattctttcatttgttttgttttaacattgttaattgttgaacagttgatctttaatgatatatcaaaaccaaatcagcttattactcattgcttcctcgtaatcatcatattcttctatttaaatcaaagtcagatcctataattctgtatatgtttcaagttaattagaacctcctaatcaaactggtccttggttgggttatatggctgcagagttgagattttaaatacgcattttcactgcaagaattcggagctggttaagatataatgtaagagcatttgtgtttgattttttggttgctgaggcaacattttgaatattatactgaatgtatattactttaaatccttttgctagatgttgaaatttctgaacattttaaatgcttaaaagttggatgtcttacccagcaaacataaacacctacctcggttttagacttactgtcacgttgcctgtgcatctgaatttggttttcagttgaaaacttacatactcagattcaaaaagaaaacaaaaataaagaaacttctcattttagtatttctcatccaaaaaataggttatatatattcaaagaaattagtatgtattatttggtatataaaaatttcaatactttttatcttgtgcactaaatgcctaattaagaaaataaaaagttattgaaataaaataagctaattaaattaaaaacaaaaattaattatttgtgttttttcaatatttaagagaatgaatgcatttcccttaacagatggaacaaaatagattataaaatactattactttataaattattgcacatttaaactataagtgtaagagatacattcaacatataaaaatgagggacgttcagaactgaaactaatttttttttaccgataattgataaaaaaaaaaggtgatgatcaagtaaaatcgaaatttgttggatccatttgaatttttttttagaaccaagcagttctgtttggtttgtggtttaac
attttgaaattgatccaaagagaatcgaagagaaatcaattatttaagtattattatagtacttaatatttcactatcatacacaagagaagaagagatgtgtatgcgagagaagaaaattaatatgtctaactagtttcatgtttataaatctgagtaaaagatacgtttaacatatgaataaagtaaaaagaagagtaaaaggaagagttattataatagtgcgaaagttacattattaacaagtgctaaattaaaatcaaagattattgaaaggaaaccatatataatgactaaattttttgatgtttatgttttgaattttgtgaacttatgtgtgcaaagattctgttagagttaataaaatgtacatgctagaagttatttataatttatattacaatgtataaatataatccaatcaccttttttctttttattttatagtaattcatttgttttaaatatttaattatagttaaatgagagaataaaattattcattttgatataacattagagttgttgagaataagaatgaatgacttcttgaataggtcgtgaatgaggtctttgttgttgcgtatattttttgtcttctagttatcgattttgaatagtgttcaatcttctgtcttcgattagtaaagaggagggagtatctgaagaagattatccaatgtttaagtcattactcgatatttgatgattgtagataataaatgttattatatgcatatctttacttgtttggtagtagtggtcaacttatttaaagggataatttgagcttaaaaagtgttgagccaaattaacacgagtagcgttatcaagtctttgtattaatctaacacgaagtctaatgtcactagacttgtcatggatgtagagcattaaatataaaatatgttcctactagggagatgagacctagagaactattgaagtcttcttctccttccttctttcggtcggttaggttgttgggtgatgaacttgaattcttctcgtcctcctgcttctccttcggcactcggtctcgggtgaacaccggatggtgggggtacctgcgaaggcactccgacactcaagttagtaaagcgggtgatcaattttctggtaggtgaacagtaataaatgacgtacctttctacttgggatgtgtgctatttatattattctaatgggcctaccttgttggacccgtttatcggagtggatcttgcttaggattgcattacttaattattaatgatggattagttttactgaccttggtttgagcgttaacggtagtatgggtcggctggctagaccgaccgtgatgtttccacttgtctcggtcgcctcggccgaatctatcgggtctcggccagactccctctgatctcggtctcggacaatcaagtgggtatcggttaggatgaccaagtgtcgatctcgcccatatcgatacaaaatatattcactaataaataaaaatgtatataataaaatcacaaacaaaaactatactaaaaaaatcatttcttttaattttttttttttaatcttcaaatcgaaccaaatcaaacggttaataatttttatattaggttgaaatacattttaacttaaaaaattaaagtacataccgcgaaacatcctaataaaaataattgaacagatataaaacaaagattgtgttaggaatgaatttctaaagtctgaacaaatggactgttaaggttgggattttccccgggttaggtgaataacaagaatattgttcggtctctttcaaaccattctttatttcaaaggatattagcttttaaatgaatataatattatattctattaattctattttataaaattgattaaaaataaaatttacatccaattatataatatagttaagatgttgttacatatagttgatgttaaaaatataaaaaaaaaaatcacagtcgtatacttagtatagagaaaaaatggtagttgaatgaaattaaaca
t
ANSWERS
1) Use a sequence format converter to obtain the sequences in FASTA format:
>1
ATTTCCTTGCCTTAAATCAATCAACCGTACTCACACTGTTCTTCTTCTTCTAACAACTTCTGCAATTTGTTCTCATTTATGTTCAGACAGCTTCTCAGCAGCTAATCTCTCAAAAATAATAAGAAAATGGCAGCAGCAGGTCTCTCTTCTTCTTCATGTACTGTATTCACACTCTCCTCATCCTTTAAAACCAGAAGACATCTCACTAAAACCCCACAAAACCCTTTGCTTTTCAAACGCAACAACAGCAGCAACCGCACTTCACATCCCTTTCTTCCTTCCAGAAAGTTGTGTAGAGTTCAAGCAACAATTTTGAGAGAAGATGAAGAGAAGAAAGTTGTTGTTGAGGAATCATTTCAACCAAAGACTTTCACTCATGAGCCTGTTCGGGGCTCACCACAATCTTCATCGCCTGGTGGATTAGAGACTTGGGCTATCAAGCTTGAGCAATCTGTCAATGTCTTTCTCACTGATTCGGTGATAAAAATTCTTGATACTCTTTATCACGACCGCGATTATGCTAGGTTCTTTGTTTTGGAAACTATTGCTAGAGTTCCTTATTTTGCCTTTATATCTGTTCTTCACATGTATGAGAGTTTTGGTTGGTGGAGAAGAGCTGATTATCTCAAAGTGCATTTTGCTGAGAGCTGGAATGAGATGCATCACTTGCTTATCATGGAAGAATTGGGTGGTAATTCTTGGTGGTTTGACCGGCTTCTTGCTCAAGTTATAGCAACCTCTTATTATTTCATGACAGTCTTAATGTATGCATTGAGCCCAAGAATGGCATATCACTTCTCTGAATGTGTTGAGAGCCATGCATTTGCAACTTATGACAAATTTATCAAGGCCCAAGGAGATGATTTGAAAAAATTGCCTGCACCTGAGGTTGCTGTAAAATATTATACCGAGGGTGATTTGTACTTGTTTGATGAATTTCAAACTTCCAGAGCTCCCCATTCTCGTAGGCCAAAAATAGAGAATTTGTATGATGTATTTCTGAACGTCAGAGATGATGAGGCTGAACATTGTAAGACCATGAAGGCCTGCCAAACACATGGAAATCTCCGCTCTCCACATTCATATCCAGAGGATGCTTTTGAAGATGATACTGGCTGTGATCTTCCTCAAGCAGATTGTGAAGGTATTGTTGATTGTATAAAGAAATCTGTAACATCTCCTCCATCAAAGCAAAATATTTGAAGGAGAACAAGGAAACAGTATACATTTCAGGTGATGGATAGTGAGATATTGTATACATAGAAAGAAACATTACAAATTTACATAATTTACATGCCTTCAAGCTAAAAAT
>5
GGACTATCCTCTCCTCAAAGATCCCCACCCCGTATTCAATTCCCTCTTCTCCTTCTCTGAATTGCTTTCCATCTCTTTGTGTTTTCTCCATTTTTTGTTTCTTTCTGGAGAGATAGAATGGCCATGAGTATCTCACCGGCAGCACATCCAACCTCAATCTCCATTAGAAATTGTAAAACCCTTTCCCCTTTCCCCCGCCAAAATGTTTTGAGATACAATCCCCTCTCTCCTCGTCGATCTAAAGTCTCTGGGAAGCTGTTCCGAGTTCAAGCAACAGTTTTGCGAGAAAATGATGAGAAAGTGGTGGTAGAGGAAACATTTCAACCCAAAACTTCTACAAATGAGGAGAGAAAAGGAGGAGATGGTGAGCCCCAGGATGATTCCTCATCAGGTACTCTGGAGAAATGGGTTATCAAGCTTGAACAATCTATCAATATCTTTCTCACGGATTCAGTGATAAAGGTTCTAGATTCTTTGTACCATGACCGAGATTATGCAAGGTTCTTCGTGTTGGAAACTATTGCAAGAGTTCCTTATTTTGCCTTTATATCTGTCCTTCATATGTATGAGAGTTTTGGTTGGTGGAGAAGAGCGGATTATCTCAAAGTACATTTTGCTGAGAGCTGGAATGAGATGCACCATTTGCTTATCATGGAAGAATTGGGGGGCAATGCTTGGTGGTTTGATCGGTTTCTTTCCCAACATATAGCCGTCGTGTACTATTTTATGACAGTTCTCATGTATGCAATAAGCCCAAGAATGGCTTATCACTTTTCTGAATGCGTGGAAAGCCATGCATTTTCAACTTATGACAAATTTATCAAAGCACAAGGAGAGGAGTTGAAAAAATTACCTGCTCCTGAAGTTGCTATAAAATACTACACTGGTGGAGACTTGTACTTGTTTGATGAGTTTCAAACTTCCAGAGCCCCCAATACACG
>4
CCCACTTTGGAATTTTCTTTACCTTACCTAAGAAGGGTATTAATTTGATTCTTGTGGGAAGGAAGAAGGATCAAGAATGGCGATTTCGATTTCTGCTATGAGTTTTGGAACCTCAGTTTCTTCATATTCTTGTTTTAGAGCTAGGAGTTTTGAGAAGTCATCAGTTTTATGCAATTCCCAGAACCCATGTCGGTTTAATTCTGTTTTTCCGATTCGGAAATCTGATGGGGCTTCACGGTGTTCTGTTTCTAGGAAATCATGTAGAGTTCGAGCAACGTTGTTACAAGAGAATGAAGAAGAAGTGGTTGTGGAGAAATCTTTTGCACCTAAGAGTTTTCCTGATAACGTGGGAGGGGGAAGTAATGGGAAGCCACCAGATGATTCATCCTCTAACGGTCTAGAGAAATGGGTTATAAAGCTTGAGCAGTCTGTAAATATCTTACTCACGGATTCAGTGATAAAGATTCTTGACACTTTGTATCACAACCGAAACTATGCGAGGTTTTTTGTTCTGGAAACAATTGCAAGGGTTCCTTATTTTGCATTTATATCGGTTCTTCACATGTATGAGAGCTTTGGCTGGTGGAGAAGGGCAGATTATATGAAAGTGCATTTTGCTGAAAGCTGGAATGAGATGCACCATTTGCTCATTATGGAAGAATTAGGGGGAAATGCTTGGTGGTTTGATCGATTTCTTGCACAACATATAGCTATATTCTATTATTTCATGACAGTCTTGATGTATGCTTTGAGCCCGAGAATGGCATATCATTTCTCTGAATGTGTGGAGAGCCATGCATACGAGACTTACGATAAATTCATCAAGGATCAAGGAGAGGAATTGAAGAATTTGCCCGCTCCAAAGATTGCAGTGGACTACTACACGGGAGGTGACTTATATTTATTTGATGAGTTTCAAACTTCACGAGAGCCTAATACTCGAAGACCAAAAATAGATAATCTCTATGACGTATTCATGAACATTAGAGATGACGAAGCAGAGCATTGTAAAACGATGAAAGCCTGTCAAACTCACGGGAGCCTTCGTTCTCCACACACAGATCCATGCGATGATTCTGAAGATGATACAGGGTGTTCCGTACCTCAAGCTGATTGTATAGGTATCGTGGATTGTATAAAGAAGTCAGTCACCGATACTCAAGTAACCAAAAGGTAGGAAAAGGAAAAACGCGGACAAACTATACTTGTATATACTAGTATAGACAAAAAAAAAAAAATACAAAGATATAGGTACATGTTGTATCTTTTTGTCTGAGGTAGAGATGTTTTGGAAAGTTTTCAAGGCAAAATACATAGTTTACCTCATGTAAGCTCTCCTTACTAAGATTGTTTTTATATTTAGAATTTGTACCCGAAACCTGTTATAGGTGGATGGATCTTGACAACTTTTGGATATATCGACTCCATGAGCTTTTTACACACGCAAAAAAAAAAAAAAAAAA
>2
CTCTCCTCGTGGTCGCCAGCACTGCCGTGCCGAGCCGGAGGCCGAGCTCGGCAGCTGCCCGCCTCGGTGTGTCGCCGCCGCACCCCACCCACGTAGCAAAAGCCGTCTTCCCGTACCGTAGGCAGCGGCAAACAGCGCCCCCTCATCACTCGGTCTCGTTCCCTCCTCTCCAACTTCACTCGCGTCGCGCCCCTCCACTGCACCAGCGTCATGGCGGTGGCCTCGACCTCGCCGCTGTCCGCCAAGCCCGCCACGGCCCCTTCGCCGCCCGCTCCCGGATCCGGGCTCCTCGCTCTCGGCGTTCGCCGCGCCCCCGCCACTGCCGCGTGGAGGAGGCTCCGCGTGGAGGCGATCAGGACGCAGCGAACGGAGGTGCCCGTCGAGGAGTCCGCCCCCGCCAGGGACGCCGCCGCTGCCGCGCCCCTGGACGGAAACGGAGCCGGAGCGGACGGCTCCGTGGTTCCTTCCTCGGACGACAGCTGGGTTGTCAAGCTCGAGCAGTCGTTCAACATTTTCGCCACGGATTCGGTGATTATGGTACTCAAGGGCGTGTACGGTGATCGGTACTACGCCAGGTTCTTTGCGCTGGAGACGATTGCGAGGGTGCCGTACTTCGCATTCATATCGGTGCTTCACTTGTATGCGACCTTTGGATGGTGGAGACGAGCTGATTACATAAAGGTTCACTTTGCGCAGAGCTGGAACGAGTTCCATCACCTCTTGATCATGGAAGAATTGGGTGGCGACTCTTTGTGGTTTGACTGTTTTCTTGCTCGGTTTATGGCATTCTTTTACTACTTCATGACTGTTGCAATGTACATGCTGAGCCCACGAATGGCATATCACTTTTCCGAATGTGTGGAGAGACATGCATATTCCACCTATGATGAGTTCCTCAAGCTCCATGAAGAGGAATTGAAAAGACTACCAGCTCCAGAGGCAGCATTGAACTATTACATGAATGAGGACCTTTACTTATTCGATGAGTTTCAGGCATCAAGAACTCCAGGTTCTAGGAGGCCTAAAATAGATAACTTATACGATGTATTCGTTAATATACGAGAAGATGAGGCAGAGCACTGCAAGACAATGAAGACCTGTCAAACACATGGAAATCTTCGTTCTCCTCATTCAACGCCGAACTGCTTAGAAGATGATACGGAATGTGTAATACCTGAAAACGACTGTGAAGGTATTGTGGACTGTGTCAAAAAGTCCCTTACAAAGTAAATAGACTATATTCTTTAATGTGTTTGACTGTTTTCAGGTGTCGTAGTATACAAAGTATAAAATTTTGATGATATCCTCTCT
>3
GGCCCTCACTCTCCCTCGGTCCCTCGTCTCCACCTTCGCTCGCGCGCCGGCGCCGCTCACACCCGATTTCCACCTCACAAAACCCACACGGAACCCCGGCACCCCTCCACTCCACCAGGCGCCAGCGTCGTCATGGCGGTAGCCTCGACGTCGCCCCTTTCCGCCAAGCCCACCACGGCCACCTCGCCGCCCACTCCCGCATTCGGCTTCCTCGCTCCCCGCTCCCGCCGCGGCCGCGCCACCGCGTGGAGGAGTCTCCTCGTGGAGGCGATCCGGACGCAGCGGGAGAAACAGGGGACGGAGGGGCACTTCGAGGAGTCCGCCGCCGCGCCCCTGGACGGAGTCGGAGTCGGAGCCGATGACCCCGTGGTTCCCTCCTCGGACGCGAGCGACTGGGTGGTCAAGCTCGAGCAGTCCTTCAACATTTTCGCCACGGATTCGGTGATAATGGTACTCAAGGGCGTGTACCGTGATCGGTACTACGCCAGGTTCTTTGCGCTGGAGACGATTGCCAGGGTGCCGTATTTCGCGTTCATATCGGTGCTTCACATGTATTCGACCTTTGGCTGGTGGAGACGAGCGGATTATATAAAGGTTCACTTTGCGCAAAGCTGGAACGAGTTCCATCACCTCTTGATCATGGAAGAATTGGGTGGCAACTCTTTATGGATTGACTGTTTCCTTGCTCGGTTTATGGCGTTTTTTTACTACTTCGTGACTGTTGCGATGTACATGCTGAGCCCAAGAATGGCATATCACTTTTCTGAATGTGTGGAGAGACATGCATATTCCACCTATGACGAGTTCCTCAAGCTCCATGAAGAGGAATTGAAAAGACTACCAGCTCCAGAGGCAGCATTAGAATATTACCTGAATGAGGACCTTTACTTATTCGATGAGTTTCAGGCATCAAGAAGTCCAGGTTCTAGGAGGCCTAAAATAGATAACTTATATGATGTATTTGTCAATATACGAGATGACGAGGCAGAACACTGCAAGACAATGAAGACCTGTCAAACACACGGAAATCTTCGTTCTCCTCACTCAACGCAGAACTGCTTAGAAGCTGATACGGAATGTGTAATACCTGAAAACGATTGTGAAGGTATTGTGGACTGTGTCAAAAAGTCCCTTACAAAGTAAATAGACTATATTCTTTCGTGTGTTTCCAGGCGTCATAGTATACAAAGTACAAAATTTTTATGATATCCTCTTTTTTCCTATGTACCTTTTTAAGTAAAATCTGGGTCATATGTAAGAGAAGCCCAGTTTAGAACTTTAGATACAATTATGCAAC
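The conversion in answer 1 can also be scripted. The following is a minimal sketch, not the converter used in the exercise. It assumes each alignment row has the form `name segment [count]` and that consensus rows begin with whitespace. The function name `clustal_to_fasta` is made up for illustration.

```python
def clustal_to_fasta(text):
    """Turn a CLUSTAL alignment into ungapped FASTA records."""
    seqs = {}  # dicts keep insertion order, so records follow the alignment order
    for line in text.splitlines():
        # Skip blank lines, consensus rows (leading whitespace) and the header.
        if not line.strip() or line[0].isspace() or line.startswith("CLUSTAL"):
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        name, segment = parts[0], parts[1]
        # Keep letters only: drops gap characters and any digits fused to the row.
        residues = "".join(ch for ch in segment if ch.isalpha())
        seqs[name] = seqs.get(name, "") + residues
    return "".join(f">{name}\n{seq}\n" for name, seq in seqs.items())
```

Keeping only alphabetic characters also tolerates rows where the residue count is glued to the sequence, which can happen in copied alignments.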
2) Run blastn for each sequence against the NR (nr/nt) and RefSeq RNA databases to identify its function. In general, if the species is not found in NR or RefSeq RNA, run BLAST against the EST and TSA databases.
Sequence 1: Populus trichocarpa: plastid terminal oxidase
Sequence 2: Zea mays: plastid terminal oxidase
Sequence 3: Sorghum bicolor: plastid terminal oxidase
Sequence 4: Solanum lycopersicum: plastid quinol oxidase
Sequence 5: Euphorbia esula: alternative oxidase 4
3) Use the ExPASy Translate tool.
Seq1
MAAAGLSSSSCTVFTLSSSFKTRRHLTKTPQNPLLFKRNNSSNRTSHPFLPSRKLCRVQATILREDEEKKVVVEESFQPKTFTHEPVRGSPQSSSPGGLETWAIKLEQSVNVFLTDSVIKILDTLYHDRDYARFFVLETIARVPYFAFISVLHMYESFGWWRRADYLKVHFAESWNEMHHLLIMEELGGNSWWFDRLLAQVIATSYYFMTVLMYALSPRMAYHFSECVESHAFATYDKFIKAQGDDLKKLPAPEVAVKYYTEGDLYLFDEFQTSRAPHSRRPKIENLYDVFLNVRDDEAEHCKTMKACQTHGNLRSPHSYPEDAFEDDTGCDLPQADCEGIVDCIKKSVTSPPSKQNI-
Seq2
MAVASTSPLSAKPATAPSPPAPGSGLLALGVRRAPATAAWRRLRVEAIRTQRTEVPVEESAPARDAAAAAPLDGNGAGADGSVVPSSDDSWVVKLEQSFNIFATDSVIMVLKGVYGDRYYARFFALETIARVPYFAFISVLHLYATFGWWRRADYIKVHFAQSWNEFHHLLIMEELGGDSLWFDCFLARFMAFFYYFMTVAMYMLSPRMAYHFSECVERHAYSTYDEFLKLHEEELKRLPAPEAALNYYMNEDLYLFDEFQASRTPGSRRPKIDNLYDVFVNIREDEAEHCKTMKTCQTHGNLRSPHSTPNCLEDDTECVIPENDCEGIVDCVKKSLTK
Seq3
MAVASTSPLSAKPTTATSPPTPAFGFLAPRSRRGRATAWRSLLVEAIRTQREKQGTEGHFEESAAAPLDGVGVGADDPVVPSSDASDWVVKLEQSFNIFATDSVIMVLKGVYRDRYYARFFALETIARVPYFAFISVLHMYSTFGWWRRADYIKVHFAQSWNEFHHLLIMEELGGNSLWIDCFLARFMAFFYYFVTVAMYMLSPRMAYHFSECVERHAYSTYDEFLKLHEEELKRLPAPEAALEYYLNEDLYLFDEFQASRSPGSRRPKIDNLYDVFVNIRDDEAEHCKTMKTCQTHGNLRSPHSTQNCLEADTECVIPENDCEGIVDCVKKSLTK-
Seq4
MAISISAMSFGTSVSSYSCFRARSFEKSSVLCNSQNPCRFNSVFPIRKSDGASRCSVSRKSCRVRATLLQENEEEVVVEKSFAPKSFPDNVGGGSNGKPPDDSSSNGLEKWVIKLEQSVNILLTDSVIKILDTLYHNRNYARFFVLETIARVPYFAFISVLHMYESFGWWRRADYMKVHFAESWNEMHHLLIMEELGGNAWWFDRFLAQHIAIFYYFMTVLMYALSPRMAYHFSECVESHAYETYDKFIKDQGEELKNLPAPKIAVDYYTGGDLYLFDEFQTSREPNTRRPKIDNLYDVFMNIRDDEAEHCKTMKACQTHGSLRSPHTDPCDDSEDDTGCSVPQADCIGIVDCIKKSVTDTQVTKR-
Seq5
MAMSISPAAHPTSISIRNCKTLSPFPRQNVLRYNPLSPRRSKVSGKLFRVQATVLRENDEKVVVEETFQPKTSTNEERKGGDGEPQDDSSSGTLEKWVIKLEQSINIFLTDSVIKVLDSLYHDRDYARFFVLETIARVPYFAFISVLHMYESFGWWRRADYLKVHFAESWNEMHHLLIMEELGGNAWWFDRFLSQHIAVVYYFMTVLMYAISPRMAYHFSECVESHAFSTYDKFIKAQGEELKKLPAPEVAIKYYTGGDLYLFDEFQTSRAPNT
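The deduction in answer 3 can be approximated in code. The sketch below is not the ExPASy tool: it assumes the standard genetic code and the forward strand only, and it reports the longest stretch that starts at an ATG. If no stop codon is reached, as in a truncated cDNA, it translates to the end. The names `translate_from` and `longest_orf_protein` are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, amino acids ordered by first, second, third base (T, C, A, G).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate_from(seq, start):
    """Translate codon by codon from `start` until a stop codon or the end."""
    protein = []
    for pos in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[pos:pos + 3], "X")  # X marks ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

def longest_orf_protein(dna):
    """Return the longest protein that starts at any ATG on the forward strand."""
    seq = dna.upper().replace("U", "T")
    best = ""
    start = seq.find("ATG")
    while start != -1:
        protein = translate_from(seq, start)
        if len(protein) > len(best):
            best = protein
        start = seq.find("ATG", start + 1)
    return best
```

Taking the longest ORF rather than the first ATG matters here: a short upstream ORF in the 5' UTR would otherwise be reported instead of the main coding sequence.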
4) Only one contig was found using sequence 5 as bait.
>Contig1
GACCCTCTTAGGCGGTTGACACCTTCCCTTATACCAACTGACCGAAACCAATTCCAATTTGATTTCCATTTTCTGGCTTTTGAATTTTCGTTTCGGTCTCTGCAAAAAACCGTCGGGAATGGCAGCGGCGACTTTGTCTTCCACGGTGTTTGCAATCTCAACCTCCTGCTCTTCTTCGTCTCTCAAAGCAAGGAACTTCAAGAACTTAGCTTGCTCCACGTTTTGTTCTCAGAATCGTATTCCTTACAATCCCATTTCTGCTCGTCCATCAATTTCCAGAATTCGTGCAACTATTCTGCAAGAAGATGAAGAGAAAGTGATAGTGGAGGAATCCTTTGAGTTCAAGGCTTCTTCTCCCTCAGATGAAGTGAAATCAAGCAGTGGGGGCCCATCAGAAAGTTCGTCTTCAAGTACTTTTGAGGGCTGGGTTATTAAATGTGAACAAACCATCAACATCTTCCTTACGGATACAGTCATAAAGATACTTGATACTCTGTACCGTGACCGAGATTATGCAAGGTTTTTTGTACTGGAAACCATTGCAAGGGTTCCTTACTTTGCCTTTATGTCTGTTTTGCACATGTATGAGAGTTTTGGCTGGTGGAGAAGAGCAGATTATCTGAAAGTGCATTTTGCTGAGAGCTGGAATGAGATGCACCACTTGCTTATCATGGAAGAATTGGGGGGAAATGCTTGGTGGTTTGACCGCTTTCTTGCTCAGCATATTGCGATCTTCTATTATTTTATGACGGCCTTTATGTATATAATAAGCCCAAGAATGGCATATCACTTCTCTGAATGTGTAGAGGGCCATGCATTTTCAACTTACGACAAGTTTATCAAGGCCAGAGGAGAGGATTTGAAAAAGTTGCCTCCTCCTGAGGTTGCTGTAAAATACTACACTAGCGGTGACTTGTACTTATTTGATGAATTTCAAACTTCCAGAGCTCCCAATTCTCGAAGGCCAAAAATAGAGAATTTGTACGACGTGTTTCTGAACATAAGAGATGACGAAGCTGAACACTGTAAGACGATGAAGGCTTGCCAGACTCACGGGAACCTGTGGTCTCCTCATTCCCGTGCAGAAGAAGAAGATGACGCCCCGTGCATCATTCCTCAAACAGACTGTGAAGGTATTGTAGATTGTATAACAAAATCCGTGACAACAAAGCCAGAAAATTGATCATATAAGACACAAGAGGGGACGAGGAAATTAGGTAGAAGTAGTGAATGGTATAAGAAATTTGTTTGTATACACAAGAATGTATAGATCCAAAAGAAGCATTTGGTTATCCAATGAAGGTTGCTTTTCT
This contig contains the 5' UTR and 3' UTR (in gray) and the ORF (open reading frame), which begins at the ATG (codon of the first methionine, in green) and ends at the TGA (stop codon, in red). For the cDNA to be considered complete, it would also need a poly(A) tail at the 3' end, which was not found.
NOTE: If more than one contig had been found, bait 5 should be aligned against each contig to determine which one is most similar.
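The comparison described in the note can be sketched with a small global alignment. This is a simple Needleman-Wunsch implementation with arbitrary scores (match 1, mismatch -1, gap -2), not the tool used in the exercise. The names `global_identity` and `best_contig` are made up for illustration.

```python
def global_identity(a, b, match=1, mismatch=-1, gap=-2):
    """Fraction of identical columns in one optimal global alignment of a and b."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Trace back from the corner, counting aligned columns and identical pairs.
    i, j, identical, columns = n, m, 0, 0
    while i > 0 or j > 0:
        columns += 1
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return identical / columns if columns else 0.0

def best_contig(bait, contigs):
    """Return the name of the contig with the highest identity to the bait."""
    return max(contigs, key=lambda name: global_identity(bait, contigs[name]))
```

For a bait and contigs of very different lengths, a local alignment would be a better fit. The global version is used here only because it is the shortest to write down.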
5.
Align the cDNAs with ClustalW, then examine the phylogenetic relationships.
There are two phylogenetically close groups: group 1 (sequences 2 and 3) and group 2 (sequences 1, 4 and 5).
Indeed, sequences 2 and 3 come from monocots (Zea mays and Sorghum bicolor), whereas sequences 1, 4 and 5 come from dicots (Populus trichocarpa, Solanum lycopersicum and Euphorbia esula).
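The grouping can be checked without drawing a tree by computing pairwise identity over the aligned columns. The sketch below assumes the aligned rows are already available as equal-length strings with `-` for gaps. The function name `pairwise_identity` is hypothetical, and identity is only a rough proxy for phylogenetic distance.

```python
from itertools import combinations

def pairwise_identity(aligned):
    """Identity for every pair, over columns where neither sequence has a gap."""
    result = {}
    for (name_a, seq_a), (name_b, seq_b) in combinations(aligned.items(), 2):
        compared = identical = 0
        for x, y in zip(seq_a, seq_b):
            if x == "-" or y == "-":
                continue  # ignore columns with a gap in either row
            compared += 1
            identical += x == y
        result[(name_a, name_b)] = identical / compared if compared else 0.0
    return result
```

Pairs from the same group (for example sequences 2 and 3) should show a clearly higher identity than pairs taken across the two groups.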
6.
It is the actin gene of Phaseolus vulgaris.
7. 
This gene has 5 exons interrupted by 4 introns.
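The relation between exons, introns and the cDNA can be sketched as follows. The coordinates in the usage example are invented for a toy sequence and are not the real positions in this gene. Exons are given as half-open `(start, end)` intervals, and the helper names are hypothetical.

```python
def introns_between(exons):
    """Introns are the gaps between consecutive exons, so n exons give n - 1 introns."""
    ordered = sorted(exons)
    return [(a_end, b_start) for (_, a_end), (b_start, _) in zip(ordered, ordered[1:])]

def splice(genomic, exons):
    """Join the exon segments in order to obtain the cDNA (mRNA-like) sequence."""
    return "".join(genomic[start:end] for start, end in sorted(exons))

def follows_gt_ag(genomic, intron):
    """Check the canonical splice-site rule: intron starts with GT and ends with AG."""
    start, end = intron
    seq = genomic[start:end].upper()
    return seq.startswith("GT") and seq.endswith("AG")
```

With a toy gene `"aaATGGCCgtaagtttcagGATTAAcc"` and exons `[(0, 8), (19, 27)]`, `introns_between` returns the single intron `(8, 19)` and `splice` returns the exon sequence with the intron removed.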
8.
aaggacacgcgtgaagaacaacacgaatgtctcgttgaggcaatgtgctgagaacgtggacaccttcatttaaaaacccaaactcattgaaacttgaggtttgatttgggcgtccaatcccgtgtatcagttttcgccttcagattttgaaagacaatggccgatgctgaagatattcagccactcgtctgcgataatgggacaggaatggttaaggctggatttgccggtgatgatgctccgagggctgtgtttcccagcattgttggtcggccacgtcatactggtgtgatggttgggatggggcaaaaggatgcatatgttggggatgaggctcagtccaagcgtggtatactgactctgaaatatcccattgagcacggtattgtgagcaactgggatgacatggagaagatctggcatcacaccttctacaatgaactccgtgtggccccagaggagcaccctgttctgctcactgaagcccctctcaatccaaaggctaatcgtgagaaaatgacccaaatcatgtttgagaccttcaacacacctgccatgtatgttgccatccaggctgttctgtccctctatgccagcggtcgtacaactggcattgttctggactctggagatggtgtcagccacactgtacccatttatgaaggctatgccctccctcatgccatcctccgtcttgacttagcaggacgtgacctcactgatgccctgatgaaaatcctcactgagcgtggttactccttcactaccacagctgagagagaaattgtaagggatatgaaggaaaaattggcttatattgcccttgattatgaacaagagcttgaaacttcaaaaactagttcatctgttgagaagagctatgagctgcctgatggacaggtcatcaccattggtgccgagcgcttccgttgtcctgaagttctcttccagccatccatgattggaatggaagctgttggcattcatgagacaacatacaattcaattatgaagtgtgatgttgatatcaggaaggatttgtatggaaacatagtcctcagtggtggttctaccatgttccctggtattgcggatagaatgagcaaggaaatcactgccctggccccaagcagcatgaagatcaaggtggttgctccacccgagagaaagtatagtgtctggattggtggctctatcttggcttctctcagcactttccaacagatgtggattgcgaaggcagagtacgatgagtccggtccttcaattgtacacaggaaatgcttttaaatgcagcagaacggcttggtgtcctcgattggcaatgcagtagccccttgcaatgattgaagatctttcttaaaaaatgtttgtgttgttagaacagtgtactaagataaagtggagttttatggcagaattaattctttcatttgttttgttttaacattgttaattgttgaacagttgatcttt
The 5' UTR and 3' UTR regions are in gray.
The ORF (open reading frame), which begins at the ATG (codon of the first methionine, in green) and ends at the TAA (stop codon, in red), is in yellow, with the exon/exon junctions marked in pink.
Deduced protein
MADAEDIQPLVCDNGTGMVKAGFAGDDAPRAVFPSIVGRPRHTGVMVGMGQKDAYVGDEAQSKRGILTLKYPIEHGIVSNWDDMEKIWHHTFYNELRVAPEEHPVLLTEAPLNPKANREKMTQIMFETFNTPAMYVAIQAVLSLYASGRTTGIVLDSGDGVSHTVPIYEGYALPHAILRLDLAGRDLTDALMKILTERGYSFTTTAEREIVRDMKEKLAYIALDYEQELETSKTSSSVEKSYELPDGQVITIGAERFRCPEVLFQPSMIGMEAVGIHETTYNSIMKCDVDIRKDLYGNIVLSGGSTMFPGIADRMSKEITALAPSSMKIKVVAPPERKYSVWIGGSILASLSTFQQMWIAKAEYDESGPSIVHRKCF
9.
Promoter region (1000 base pairs 5' upstream)
Aaatgtatggacatggctcaatggagttcaattgtctcttgatcatgaatttcatattacgaagtagactttaagcctaagtcaactcctagtcgatgtgggatctttaatacacccctcacgccgagagtgacagttatgactctgataccatatcacgaagtagactttaagtctaactcaaccccataaaatcgactcatggagtgaggttcacaaccacttatatacaatgaaaatttcataatctctaattgatgtgggatctccaacatactttgtaagtgtgatgcggaacacactttataagtgtggtgcaggtaggtcgggagaccggataatcgagtagtaaaatatagttgagcctccaatagtacacaagattgtaagagttttgttttcaagaacatcatattctcacattggttaagttttgaaattgaatgaaaaataagcaaaattggataaaaatataatgtagtgtaggttgaaggaaagcaaaatatttagattaaaaaggataaaaagagtgagaaaagctatggatagaggggatggaagtgcaatgcaagtgagcccagaattttggggcaaacgaatgagcatggaggcaaagaggcaaagaggcagcaggtgcagtgcatgtcgttttgcatggcctgatgattcgtgaaggaatttgttttgtcgtttagtggaaacattatcgtttgttcccatattttgttgaggaaaagacaacacatgtctatgttggtccgtcacagaaagaattgcattgtggtgagaacgtgcagtcttattttctttcttgcatcatttatcatattttggcctataccaacgaacgtaccgttttttgtttttttgcaaaagccatgtttgcatgagacgacccagtgggtggaaagaaggcccaaatggatagtgaagaaatagcggaagtggtaaaaatgggccaagctaataaaagcaatgtgtgtgcggagccacagaagga
Submit the promoter region to databases dedicated to cis elements (regulatory elements):
http://www.dna.affrc.go.jp/PLACE/
http://bioinformatics.psb.ugent.be/webtools/plantcare/html/
http://plantpan.mbc.nctu.edu.tw/
Results obtained with PlantCARE (use Internet Explorer if possible):
Several light-responsive elements were found, such as Box I, G-Box, GT1-motif, I-box, SP1 and TCT-motif, which suggests that light could influence the expression of this gene.
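A quick local check for a few such elements can be sketched as a consensus scan. The motif strings below are illustrative assumptions (G-box core `CACGTG`, I-box core `GATAA`, GT1 consensus `GRWAAW`) and should be verified against PLACE or PlantCARE. The scan covers the forward strand only, and `find_motifs` is a made-up name.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

# Assumed consensus strings; confirm against a cis-element database before use.
MOTIFS = {"G-box": "CACGTG", "I-box core": "GATAA", "GT1 consensus": "GRWAAW"}

def find_motifs(promoter, motifs=MOTIFS):
    """Return the 0-based start positions of each motif on the forward strand."""
    seq = promoter.upper()
    hits = {}
    for name, consensus in motifs.items():
        pattern = "".join(IUPAC[ch] for ch in consensus)
        # The lookahead lets overlapping occurrences be reported as well.
        hits[name] = [m.start() for m in re.finditer(f"(?=({pattern}))", seq)]
    return hits
```

Short consensus strings match by chance quite often, so hits from a scan like this are only candidates and do not replace the database annotation.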
10) How many genomes were published between May 2007 and August 2015 for the following groups?
Go to NCBI, choose Genome, and use the Advanced search.
Plants (Viridiplantae: all plants and algae; plants only = Streptophyta)
Search
Answer: 147 plant genomes were published between May 2007 and August 2015.
Monocotyledons (monocots)
33 genomes
Eudicotyledons (eudicots)
107 genomes
Algae
Search for Viridiplantae minus Streptophyta.
Answer: 17 genomes
