Original poster | Posted on 2009-7-24 20:29:29
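For example, I want to find every sequence in the file below whose header contains the string "Calponin protein" and copy those sequences into a separate file (each sequence starts with a > character). The input file is too large, more than 30 GB, so I can only paste a very small excerpt of it here. Thank you!!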
>gi|159498513|gb|EY195367.1|EY195367 RSAA-aab21e07.g1 R.similis_EST_RSAA Radopholus similis cDNA 5' similar to ref|NP_491282.1| Calponin homology (CH) domain containing protein family member [Caenorhabditis elegans] pir|T29467 hypothetical protein F28H1.2 - Caenorhabditis elegans gb|AAB52338.1| Calponin protein 3 [Caenorhabditis elegans], mRNA sequence
GGCCGGGTCGCGAAGAGCCAGGGAGTGCCCACCGAGGAGACCTTCCAGAGCGTGGACTTGTTCGAGGCAC
GTGACCTCTACTCCGTGTGCATGACCCTGTTGTCGTTGGGCCGAATTATGGAGAAGAAGGGAAAGCCGAA
CCCATTCTCTGGATGAAGAGTAGAAGTGAGTGCAGCAAAAGACGGACAGAGCATGTGCTATCGCTCCCAT
TCGAGAACTCCCCGTTTTGCGAATTTTCTCCCCGTGTGCGACCTTGCAACAATACCGAGGCGTTAACTGT
TTTCCCCTCTCTCTCCTCTCAAACTGTGGGCATTTGAAAATAGCGATGCCGGAATAAATGGCCAATTCCA
>gi|159498512|gb|EY195366.1|EY195366 RSAA-aab21e06.g1 R.similis_EST_RSAA Radopholus similis cDNA 5', mRNA sequence
GGCCGGGATTGGCACGTAATTCAACTGGTCTTTTTGCTGGGCGGCAATTGGCAACTTTATCCGACAGAAG
CATGATAATTGGTGGAGCCGCCCTTTGCCGACGCACTTTTGCAGCCCCACCCCCCCATCGGGACACGGTA
TTCGAGGTAGAATCGGACGAGGACTTCGAGAATGGGGTATACCATGCCGAAAAACCAGTGCTCATGCAGT
TTTACGCCGATTGGTGTGGACCCTGTCAGAATTTGGCACCGAGGTTGATTGCCAAAGTGAATGGACAGGA
TGGGAAGGTATTGCTAGCGAGAGTGAATGTAGAGGGTTCCGCCGGATTTCTCGCGGAACAGTTTGATGTA
AGCTCGATTCCCACTGTGATGTGCTGGCTGCGAGGAGAGGTGGTTGACCGTTTTGAGGGCGACGTGGAAG
ACACGAAGATTGACCAAATCATATCCAAATTGGTGGAATATCAAACTGGAAATGAATGATAAAGGGCTTG
AACAGCCATTTAGTAGACGAAAAAAAAAAAAAAAA
>gi|159498511|gb|EY195365.1|EY195365 RSAA-aab21e05.g1 R.similis_EST_RSAA Radopholus similis cDNA 5', mRNA sequence
GGCCGGGTGCTGCAGGCATGTCGCACCGTCCCGCCACTCATCATCGTCCCTTCCCTGTCCCGCCCCGGCA
CACCCTTCCTGTCCAGACATTCAACTGCGGTGTGGTCGAAGTCTCTCCCAAGTCAATACCCGATGCTCCG
CCGCCATACGAGGAGTTTGTGCGTGTTCCACCACCACCGCCACAAAGGGCACCGCCCATTCTGACGCGGG
AAGAGGATGAGGAGTTGCAGAACAGACTGAACTCGGAGAGAGAGAGGGAACTGAGCGACTGGTGACATTT
GGTTTTGTCGAGTGCTGCAGCTTCGCACCATTTCCCTTTATATACGGGACTTTCTTCATTTCTTTTGTTC
CTGACTTAACAACAATTAATAGACCA
>gi|159498510|gb|EY195364.1|EY195364 RSAA-aab21e04.g1 R.similis_EST_RSAA Radopholus similis cDNA 5' similar to ref|NP_500582.1| GTP-binding protein like (21.7 kD) [Caenorhabditis elegans] sp|Q23445|SAR1_CAEEL GTP-binding protein SAR1 pir|T29706 GTP-binding protein ZK180.4 [similarity] - Caenorhabditis elegans gb|AAB52968.1| Hypothetical protein ZK180.4 [C, mRNA sequence
GGCCGGGGGACGCCATTGTATTTTTGGTCGATGTAGCCGACCTGGAACGTATTCAGGAAGCAAGGGAGGA
ATTGTGGAGTCTGATGCAGGATGAACAGGTGGCAAGTGCACCTGTGCTTGTTTTGGGCAATAAGATCGAC
AAGCCGAATGCTCTCAGCGAAGACCAGCTCAAGTACTACCTCGGCATCCAACAATACTGCACAGGAAAAG
GCCAAGTTGCGCGCTCAGATCTGGCCACTCGTCCTTTGGATGTGTTCATGTGCTCAGTCCTTAGGCGACA
GGGTTACGGCGAAGGATTCTGTTGGCTCTCACAATATCTGGACTGATTGAACGCGCCTCGGAAGTTGAAA
ATTGACACAAAAAGTAAGGACGACTCCAATCGCAACAAATCATTTCATATTATTTCTGTACTACACCTAT
TTTCGATTCATCTTATCTCTTAAACAATGTCAATGTTAAAAATCATCGGTTGCA
>gi|159498509|gb|EY195363.1|EY195363 RSAA-aab21e02.g1 R.similis_EST_RSAA Radopholus similis cDNA 5' similar to gb|AAP59456.1| cathepsin B precursor [Araneus ventricosus], mRNA sequence
GGCCGGGGCTCGGGGGCCATGCCGTTCGCATTATTGGATGGGGCGAGGCTAGCGGTCAGAAATACTGGCT
GGTGGCTAATTCGTGGAACACCGATTGGGGCGAGAAGGGCCTATTCCGCATACGTCGCGGCTCCGATGAA
GAGCGCATCGAAACATTGCAAATTGCATTTGGGACACCAAAGATTTAAAATCGGCGAATTGACTTGTAAA
AGATGGATAGTAAAATATTTCTTTTGCCA
>gi|159498508|gb|EY195362.1|EY195362 RSAA-aab21d12.g1 R.similis_EST_RSAA Radopholus similis cDNA 5', mRNA sequence
GGCCGGGGACATTAAGTGCAATCAATTCGCCACATAATGTGATGTCAAAATTTAAATCTGAAACTTGGAT
CTATTGTAACAACATCGGATGGACCATCAGCTGGTGGGAGCAAGCAACAAGAGATGGAAGACGATACACC
GGCCAAAGAGCAAGAGGAAGAACGCGAACTGGGCGAAGAGGATGGATTTGCGCAAAACCATCACAATAGT
CAATTTTCGTAGCCTCCCAACGCCAACAGCCGCCTTTCTGGCACATTATGTGAAGAGTGATCGTCCATTC
CATGCGCTGTTCCGTCGTCGACCTGTTCCTGCGCACTTGGGGGAGACCAAGGCCAAATCGATGTAATTTA
ATTGAACAAAAAATTAATGCAGTTCACGGCTTGTCTTTCATGCCTTGGATGAACTCTTCATTCATTCGGA
ATCAACCATGGCCACGTTACGTCAACTGGAAGATTTGCCGGAGAATGTGCTGGCTAGACGGAGGCTCCAG
ACAGTTCGAGCCAACGAACTCGTCAATTAGAGAAACCACAAAATATAAAGTTGACATTTTATGAATAAAT
ATATGAAAAAAAAAAAAAAAA
>gi|159498507|gb|EY195361.1|EY195361 RSAA-aab21d11.g1 R.similis_EST_RSAA Radopholus similis cDNA 5', mRNA sequence
GGCCGGGATTTTTTGTAATTATGATTTTAGGTTATAAATAATTAATGAAGTATAAACTATTAATGTTATA
TTTTTTAGATAAATTAATTTTTCTAGTAATTTATTTTTAGATAAATTAATGTTCCAGAAATATCGGCTAG
ACATTATTATTTTTAACTAAAAACTTTTTAAATTTTATTTTAATTTATATAAATTTATATTAATATAGGT
GAAATTTTAATTATAATTATGTTAATAATTTTATAAAATTTAAAATTTTAATTTTTAACTTAGGTTAGAC
ACTAATTAATGAAATTTAATAATTTTCTTTAGTAAATTTTTGA
>gi|159498506|gb|EY195360.1|EY195360 RSAA-aab21d10.g1 R.similis_EST_RSAA Radopholus similis cDNA 5' similar to ref|NP_491217.1| Forkhead associated domain containing protein (35.8 kD) [Caenorhabditis elegans] pir|T25596 hypothetical protein C32E8.5 - Caenorhabditis elegans gb|AAB42323.1| Hypothetical protein C32E8.5 [Caenorhabditis elegans], mRNA sequence
GGCCGGGGAAAGAGCCAGGCGAAGAAGACACGAAAGGAATGGGCCCAAGTGAAGAGGAGAAGGAAAAACC
GTCTTTTGTGCCCAGCGGAAAGTTGGCTAAAGACACCAACACATTCAAAGGAGTCCTCATCAAGTACAAT
GAACCGCCAGAAGCCAAGATTCCCAAGTTGCGTTGGCGCATGTATCCGTTCAAGGGAGAGCAAGACATGC
CTGTGATCTATGTGCACCGTCAGTCAGCCTATCTGGTTGGGCGGGACCGAAAAATTGCCGATTTTCCCGT
GGACCATCCGAGTTGTTCAAAGCAGCACGCAGCACTCCAGTATCGGTCTCTG
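
To make the request concrete, here is a minimal sketch of the kind of filter described above, written in Python. It assumes that the string only needs to be matched in the header line (the line starting with >) and that the file is plain text read one line at a time, so that a 30 GB input never has to fit in memory. The function name `filter_fasta` and the file names in the usage note are hypothetical, chosen only for illustration.

```python
import io


def filter_fasta(src, dst, needle="Calponin protein"):
    """Copy FASTA records whose header contains `needle` from src to dst.

    src and dst are text file objects. The input is read line by line,
    so memory use does not grow with the size of the file.
    Returns the number of records copied.
    """
    keep = False   # True while we are inside a matching record
    copied = 0
    for line in src:
        if line.startswith(">"):
            # A new record starts here: decide whether to keep it.
            keep = needle in line
            if keep:
                copied += 1
        if keep:
            dst.write(line)
    return copied


# Small in-memory demonstration with made-up records.
demo_in = io.StringIO(
    ">seq1 similar to Calponin protein 3\n"
    "GGCCGGGTCG\n"
    "GTGACCTCTA\n"
    ">seq2 mRNA sequence\n"
    "GGCCGGGATT\n"
)
demo_out = io.StringIO()
n_copied = filter_fasta(demo_in, demo_out)
```

On a real file this could be called as `with open("input.fasta") as src, open("calponin.fasta", "w") as dst: filter_fasta(src, dst)`, where both file names are placeholders. In the excerpt above, only the first record (EY195367) has "Calponin protein" in its header, so it is the only one such a filter would copy. The match is case-sensitive and exact, so a header that says only "Calponin homology" would not be selected.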