The clustalw Python module wraps the ClustalW multiple sequence alignment tool. Bundled with the wrapper are modules for reading FASTA and GDE files.
The class ClustalW wraps the Clustal W tool; it manages the sequences to be aligned and the options passed to Clustal W. Sequences can be passed to a ClustalW object when it is constructed:
import clustalw, fasta
c = clustalw.ClustalW(list(fasta.fasta_itr('test.fasta')))
Or they can be added to the object one at a time:
import clustalw,fasta
c = clustalw.ClustalW()
for rec in fasta.fasta_itr('test.fasta'):
    c.add_sequence(rec)
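To make the role of the FASTA iterator concrete, here is a minimal sketch of what a reader such as fasta_itr might do. It is not the bundled module's code: the name iter_fasta and the (name, sequence) tuple used as the record type are assumptions made for illustration, and the real module may return a different kind of record.

```python
import io


def iter_fasta(handle):
    """Yield (name, sequence) tuples from an open FASTA file handle.

    Hypothetical helper for illustration; not the bundled fasta.fasta_itr.
    """
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith('>'):
            # A new header line ends the previous record, if there was one.
            if name is not None:
                yield name, ''.join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if name is not None:
        yield name, ''.join(chunks)


# Small in-memory example instead of a file on disk.
example = io.StringIO(">seq1\nACGT\nACGT\n>seq2\nTTGA\n")
records = list(iter_fasta(example))
```

Because the reader is a generator, records can be handed to the wrapper either all at once (as a list) or one at a time in a loop, matching the two styles shown above.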
Options to Clustal W can be set through the options property, like this:
c.options = '-type=DNA'
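As a rough illustration of how an options string could end up on the Clustal W command line, consider the sketch below. It does not show how this wrapper actually invokes the tool: the function build_command, the executable name 'clustalw', and the file names are all assumptions. Only the command-line switches themselves (-infile, -outfile, -align, -type) are standard Clustal W options.

```python
import shlex


def build_command(infile, outfile, options='', executable='clustalw'):
    """Return an argument list for a Clustal W run.

    Hypothetical helper; the wrapper's own invocation may differ.
    """
    command = [
        executable,
        '-infile=' + infile,    # sequences to align
        '-outfile=' + outfile,  # where the alignment is written
        '-align',               # request a full multiple alignment
    ]
    # Split the user-supplied option string the way a shell would,
    # so that several options can be given separated by spaces.
    command.extend(shlex.split(options))
    return command


cmd = build_command('in.fasta', 'out.aln', '-type=DNA')
```

An argument list like this could be passed to subprocess.run without going through a shell, which avoids quoting problems in file names.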
The alignment is then calculated using the align() method:
a = c.align()
The alignment is only calculated if the set of sequences or the options for Clustal W have been updated since the last call to align(); otherwise the alignment from the last call is returned.
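The caching behaviour described here can be sketched with a simple "dirty" flag. The class below is a hypothetical stand-in, not the real ClustalW class: the name CachedAligner, the runs counter, and the placeholder result are assumptions for illustration. In particular, treating an assignment of an unchanged options string as "not updated" is a guess; the real wrapper may recompute in that case.

```python
class CachedAligner:
    """Hypothetical stand-in showing recompute-only-when-changed logic."""

    def __init__(self, sequences=None):
        self._sequences = list(sequences or [])
        self._options = ''
        self._alignment = None
        self._dirty = True   # True when inputs changed since the last align()
        self.runs = 0        # counts how often the "tool" was actually run

    def add_sequence(self, seq):
        self._sequences.append(seq)
        self._dirty = True

    @property
    def options(self):
        return self._options

    @options.setter
    def options(self, value):
        # Assumption: only a changed value invalidates the cached result.
        if value != self._options:
            self._options = value
            self._dirty = True

    def align(self):
        if self._dirty:
            self.runs += 1
            # Placeholder for running the external alignment tool.
            self._alignment = tuple(self._sequences)
            self._dirty = False
        return self._alignment
```

With this design, calling align() repeatedly costs nothing extra until a sequence is added or the options change.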
The returned alignment is wrapped in a class that permits a few operations for analysing the alignment and for formatting it for output. For more details, import the module msa and use help(msa).
To install the module, download the source code and unpack it (tar xzf clustalw-a.b.tar.gz, where a.b is the version number of the wrapper). Then run the setup script:
python setup.py install
See python setup.py --help for more details.
Thomas Mailund, <mailund@birc.au.dk>, Bioinformatics Research Center, University of Aarhus.
Time-stamp: "2006-01-26 21:57:18 mailund"