Fasta header replacer - help page
 

It is highly recommended to try the example button on all services.

The fasta header replacer is able to perform several tasks:

  1. Replace all headers one by one. The first sequence gets the first new header, the second sequence gets the second header, and so on.
    This is the default behaviour and only requires two inputs: Sequences and new headers.

  2. Replace all headers by non-exact matching between search terms and the old headers. This is trickier but may be useful for some. If your headers look like this:
    >gi|109693353|gb|DQ535174.1| Homo sapiens isolate YM3-CON-14-VH3-13 immunglobulin heavy chain variable region gene, partial cds
    >gi|109693333|gb|DQ535153.1| Homo sapiens isolate YM3-CON-4-VH3-59 immunglobulin heavy chain variable region gene, partial cds
    >gi|109693331|gb|DQ535129.1| Homo sapiens isolate YM2-CON-14-VH3-29 immunglobulin heavy chain variable region gene, partial cds
    >gi|109693329|gb|DQ535128.1| Homo sapiens isolate YM2-CON-14-VH3-78 immunglobulin heavy chain variable region gene, partial cds
    
    And I would like to change them to
    Globulin_DQ535129
    Globulin_DQ535128
    Globulin_DQ535174
    Globulin_DQ535153
    
    I can search the old headers using
    DQ535129
    DQ535128
    DQ535174
    DQ535153
    
    as search terms. NOTE that the order of the sequences is ignored, i.e. your sequences don't need to be in the same order as your headers, as long as your search terms and new headers are in the same order (typically copied from a spreadsheet).

  3. Replace all headers by EXACT matching between your search terms and the old headers. Same as above, but it requires that your search term matches the entire old header. In the above example nothing would be matched, but if you searched
    >gi|109693353|gb|DQ535174.1| Homo sapiens isolate YM3-CON-14-VH3-13 immunglobulin heavy chain variable region gene, partial cds
    >gi|109693333|gb|DQ535153.1| Homo sapiens isolate YM3-CON-4-VH3-59 immunglobulin heavy chain variable region gene, partial cds
    >gi|109693331|gb|DQ535129.1| Homo sapiens isolate YM2-CON-14-VH3-29 immunglobulin heavy chain variable region gene, partial cds
    >gi|109693329|gb|DQ535128.1| Homo sapiens isolate YM2-CON-14-VH3-78 immunglobulin heavy chain variable region gene, partial cds
    
    with
    gi|109693331|gb|DQ535129.1| Homo sapiens isolate YM2-CON-14-VH3-29 immunglobulin heavy chain variable region gene, partial cds
    
    as search term and
    Homo sapiens isolate YM2-CON-14-VH3-29
    
    as the replacement term, it will give you some warnings and the following output:
    >gi|109693353|gb|DQ535174.1| Homo sapiens isolate YM3-CON-14-VH3-13 immunglobulin heavy chain variable region gene, partial cds
    >gi|109693333|gb|DQ535153.1| Homo sapiens isolate YM3-CON-4-VH3-59 immunglobulin heavy chain variable region gene, partial cds
    >Homo sapiens isolate YM2-CON-14-VH3-29
    >gi|109693329|gb|DQ535128.1| Homo sapiens isolate YM2-CON-14-VH3-78 immunglobulin heavy chain variable region gene, partial cds
    

    Tadaa! It found and changed the header in sequence number 3.
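
For readers who want to see the three tasks side by side, here is a minimal Python sketch of how they could work. It is not the code behind this service. The function name replace_headers and its parameters are made up for illustration, "non-exact matching" is assumed to mean that the search term occurs somewhere in the old header, and the warnings the service prints are not modelled.

```python
def replace_headers(fasta_text, new_headers, search_terms=None, exact=False):
    """Return FASTA text with its headers replaced (illustrative sketch only).

    search_terms=None  -> task 1: replace headers one by one, by position
    exact=False        -> task 2: the search term may occur anywhere in the old header
    exact=True         -> task 3: the search term must equal the entire old header
    """
    out = []
    position = 0  # index of the current sequence, used by task 1
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            out.append(line)  # sequence lines are passed through untouched
            continue
        old = line[1:]  # old header without the leading ">"
        new = None
        if search_terms is None:
            # Task 1: the n-th sequence gets the n-th new header.
            if position < len(new_headers):
                new = new_headers[position]
        else:
            # Tasks 2 and 3: search terms and new headers are paired by order,
            # so the order of the sequences themselves does not matter.
            for term, candidate in zip(search_terms, new_headers):
                matched = (old == term) if exact else (term in old)
                if matched:
                    new = candidate
                    break
        position += 1
        # Headers without a match are kept as they were.
        out.append(">" + (new if new is not None else old))
    return "\n".join(out)


if __name__ == "__main__":
    # Shortened, made-up headers and sequences, just to show the call.
    fasta = ">gi|1|gb|DQ535174.1| first\nACGT\n>gi|2|gb|DQ535129.1| second\nTTGA"
    print(replace_headers(
        fasta,
        new_headers=["Globulin_DQ535129", "Globulin_DQ535174"],
        search_terms=["DQ535129", "DQ535174"],
    ))
```

In this sketch a header that matches no search term is left unchanged, which mirrors the output shown for task 3 above, where only sequence number 3 was renamed.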