Why Classes

A note on modularity and encapsulation...

Why ORFs and codons beg to be implemented as classes

Constructing a Codon class

To construct the class use the class keyword - in much the same way you use the def keyword when defining a function.

# a class - a custom compound data type with functionality
class Codon:
    pass

pass is a Python keyword that just means "do nothing". It is needed here because Python requires at least one statement inside the class statement.

Creating an object

Your Codon class is a Codon object factory. Use it to make a Codon object like this:

# making an object
codon = Codon()
print codon

As you can see when you print it, codon is an instance of the Codon class, i.e. a Codon object.

Attributes

Classes and objects have attributes. You can make an aa and a triplet attribute like this:

# attributes
codon.triplet = "ATG"

from exerciseWeek4 import aminoAcidMap

codon.aa = aminoAcidMap["ATG"]

print codon.triplet
print codon.aa

Just as a method is a function that belongs to a class, an attribute is a variable that belongs to a class or an object. Note the .something syntax for creating, setting, and accessing attributes.
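
As a small sketch of the point above, attributes set on one object belong to that object only. The two-entry dictionary is a hypothetical stand-in for aminoAcidMap from exerciseWeek4 (its single-letter values are an assumption about that map), and print is written with parentheses so the snippet runs under both Python 2 and Python 3.

```python
# Hypothetical stand-in for aminoAcidMap from exerciseWeek4;
# the single-letter codes are an assumption about its format.
aminoAcidMap = {"ATG": "M", "GTT": "V"}

class Codon:
    pass

# Two separate objects made from the same class
start = Codon()
start.triplet = "ATG"
start.aa = aminoAcidMap[start.triplet]

valine = Codon()
valine.triplet = "GTT"
valine.aa = aminoAcidMap[valine.triplet]

# Each object keeps its own attribute values
print(start.triplet)
print(valine.triplet)

# Asking for an attribute that was never set raises AttributeError
empty = Codon()
try:
    empty.triplet
except AttributeError:
    print("empty has no triplet yet")
```

Setting attributes by hand like this works, but nothing guarantees that every Codon object ends up with both a triplet and an aa.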

The special method for Initializing

Now, we don't want to set the triplet and aa data by hand in every object we make. It would be nice if this happened automatically every time we made a new object using the Codon class. To do this we need to define an initialization method for the class. This method is always called __init__ so that Python recognizes it as the function to use for initialization. A method is defined in much the same way as an ordinary function. What makes it different is that it is defined inside the class statement, that its first parameter is always the object it is called on, and that it is called through an object using the .something syntax.

Let's write an __init__ method that sets the two attributes. Note how the argument "ATG" given to the class when constructing the codon object is passed to the __init__ method as the second parameter. The first parameter is always self.

# the initialization method
class Codon:

    # first argument is the object that calls the method - self by convention
    def __init__(self, triplet):
        self.triplet = triplet
        self.aa = aminoAcidMap[triplet]

codon = Codon("ATG")
print codon
print codon.triplet
print codon.aa

This __init__ method is called only when the object is constructed, and it only serves to set the object up the way we want it.
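
To see when __init__ runs, the sketch below adds a print call to it. The dictionary is again a hypothetical stand-in for aminoAcidMap from exerciseWeek4, and using "*" for the stop codon is an assumption about that map.

```python
# Hypothetical stand-in for aminoAcidMap from exerciseWeek4;
# "*" for the stop codon is an assumption.
aminoAcidMap = {"ATG": "M", "TAG": "*"}

class Codon:

    def __init__(self, triplet):
        # runs once for every object that is constructed
        print("__init__ called with " + triplet)
        self.triplet = triplet
        self.aa = aminoAcidMap[triplet]

start = Codon("ATG")
stop = Codon("TAG")

# __init__ now requires a triplet, so leaving it out is an error
try:
    Codon()
except TypeError:
    print("a triplet is required")
```

Each call to Codon(...) triggers exactly one call to __init__, with the new object as self and the given string as triplet.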

A method with no arguments

Task: write a method that prints the aa attribute.

class Codon:

    def __init__(self, triplet):
        self.triplet = triplet
        self.aa = aminoAcidMap[triplet]

    def printAA(self):
        print self.aa

codon = Codon("ATG")
codon.printAA()

Notice how the printAA method takes no arguments but still takes one parameter because self (the object it is called on) is always the first parameter of a method.

Calling printAA is pretty much like saying "Hey codon - print your amino acid!". Notice how the perspective is different from using a function - where you would say "Hey function - print the amino acid of this codon object!"
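
The two perspectives can be put side by side. In this sketch printAAOfCodon is a hypothetical ordinary function written only for comparison, and the dictionary is a stand-in for aminoAcidMap from exerciseWeek4.

```python
# Hypothetical stand-in for aminoAcidMap from exerciseWeek4
aminoAcidMap = {"ATG": "M"}

class Codon:

    def __init__(self, triplet):
        self.triplet = triplet
        self.aa = aminoAcidMap[triplet]

    def printAA(self):
        print(self.aa)

# Hypothetical function doing the same job from the outside
def printAAOfCodon(codon):
    print(codon.aa)

codon = Codon("ATG")

# "Hey codon - print your amino acid!"
codon.printAA()

# "Hey function - print the amino acid of this codon object!"
printAAOfCodon(codon)
```

Both lines print the same thing. The difference is where the knowledge about codons lives: inside the class, or in a separate function that has to know how a Codon object is built.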

A method with argument(s)

Task: write a method isSameAA that returns True if the AAs are the same and False otherwise:

class Codon:

    def __init__(self, triplet):
        self.triplet = triplet
        self.aa = aminoAcidMap[triplet]

    def isSameAA(self, other):
        return self.aa == other.aa

    def printAA(self):
        print self.aa


codon1 = Codon("GTT")
codon1.printAA()

codon2 = Codon("GTC")
codon2.printAA()

print codon1.isSameAA(codon2)

If you call the method like this: codon1.isSameAA(codon2), the self parameter receives the value of codon1 and the other parameter receives the value of codon2.
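
One way to make this binding of parameters visible is to call the method through the class and pass both objects explicitly. This is only a sketch; the dictionary is a hypothetical stand-in for aminoAcidMap from exerciseWeek4.

```python
# Hypothetical stand-in for aminoAcidMap from exerciseWeek4
aminoAcidMap = {"GTT": "V", "GTC": "V", "ATG": "M"}

class Codon:

    def __init__(self, triplet):
        self.triplet = triplet
        self.aa = aminoAcidMap[triplet]

    def isSameAA(self, other):
        return self.aa == other.aa

codon1 = Codon("GTT")
codon2 = Codon("GTC")
start = Codon("ATG")

# The usual way: codon1 becomes self, codon2 becomes other
print(codon1.isSameAA(codon2))

# The same call spelled out through the class, passing self explicitly
print(Codon.isSameAA(codon1, codon2))

# GTT and ATG code for different amino acids
print(codon1.isSameAA(start))
```

The first two calls are equivalent. You would normally write the first form; the second is shown only to make clear where self comes from.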

Operator overloading

You may have wondered why all the following makes sense in Python:

4 == 5                # returns False because the numbers are not the same
[1,2,3] == [1,2,3]    # returns True because all elements are the same
"kasper" == "munch"   # returns False because the strings are not identical

How does Python know what is "meant" when we test the equality of such different things? The answer is that Python interprets x == y as x.__eq__(y). As you know, everything is an object in Python. What happens is that Python looks in the x object for a method called __eq__ (eq for equal). If it finds it, as it does for integers, lists, strings and many other objects, it runs that method and returns whatever that method returns. If it does not find it, it falls back to a default comparison, which for objects of your own classes only checks whether x and y are the very same object. This means that we can write our own __eq__ method to define what happens when two codon objects are compared with ==. Actually we already did this with our isSameAA method, so all we need to do is rename it to __eq__. And voila, now we can write codon1 == codon2 and Python will know that what we mean is whether the two codons code for the same amino acid. We have overloaded the == operator with new meaning.
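
You can check this translation yourself by calling __eq__ directly on built-in objects. The sketch below uses a list and a string, for which the method exists in both Python 2 and Python 3.

```python
numbers = [1, 2, 3]

# The operator and the explicit method call give the same answer
print(numbers == [1, 2, 3])
print(numbers.__eq__([1, 2, 3]))

print("kasper" == "munch")
print("kasper".__eq__("munch"))
```

Nobody writes the second form in practice. It is shown only to make the hidden method call visible.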

The same kind of thing happens when you print an object. You tell Python what to print by defining the __str__ method. This method should return (not print) the string that print is supposed to print. We did this already too. All we need to do is rename printAA to __str__ and make it return the aa string attribute instead of printing it.

class Codon:
    def __init__(self, triplet):
        self.triplet = triplet
        self.aa = aminoAcidMap[triplet]

    def __eq__(self, other):
        print self.aa, other.aa
        return self.aa == other.aa

    def __str__(self):
        return self.aa


codon1 = Codon("GTT")
codon2 = Codon("GTC")

print codon1, codon2

print codon1 == codon2
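
To see what the __eq__ method buys us, compare a class that has it with one that does not. PlainCodon is a hypothetical class written only for this comparison, and the dictionary is a stand-in for aminoAcidMap from exerciseWeek4.

```python
# Hypothetical stand-in for aminoAcidMap from exerciseWeek4
aminoAcidMap = {"GTT": "V", "GTC": "V"}

# Hypothetical class without an __eq__ method
class PlainCodon:

    def __init__(self, triplet):
        self.triplet = triplet
        self.aa = aminoAcidMap[triplet]

class Codon:

    def __init__(self, triplet):
        self.triplet = triplet
        self.aa = aminoAcidMap[triplet]

    def __eq__(self, other):
        return self.aa == other.aa

plain1 = PlainCodon("GTT")
plain2 = PlainCodon("GTC")

# Without __eq__, == is only True for the very same object
print(plain1 == plain2)
print(plain1 == plain1)

# With __eq__, == compares amino acids
print(Codon("GTT") == Codon("GTC"))
```

The two PlainCodon objects code for the same amino acid, but Python has no way of knowing that this is what we care about until we define __eq__.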

Constructing an ORF Class

Using classes in classes...

Task: write an __init__ method that takes an argument that is an orf sequence. The __init__ method should split the orf into codons and make an attribute that is a list of Codon objects. To help you, here is the code you used in the ORF exercise for the same purpose:

seq = "ATGGTTTAG"
codonList = []
for i in range(0, len(seq), 3):
    codonList.append(seq[i:i+3])

Once you have done this, implement

print '=' * 30
class ORF:

    def __init__(self, seq):
        self.codonList = []
        for i in range(0, len(seq), 3):
            s = seq[i:i+3]
            c = Codon(s)
            self.codonList.append(c)
        #self.codonList = [Codon(seq[i:i+3]) for i in range(0, len(seq), 3)]

    def __str__(self):
        s = ''
        for codon in self.codonList:
            s += str(codon)
        return s

    def __eq__(self, other):
        return str(self) == str(other)


orf = ORF("ATGGTTTAG")
print orf

orf1 = ORF("ATGGTTTAG")
orf2 = ORF("ATGGTATAG")
print orf1 == orf2
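
Note that GTT and GTA both code for valine, so the two ORFs above translate to the same protein. The sketch below adds a third ORF with a real amino acid difference, with __eq__ comparing the two objects involved, self and other. The dictionary is a hypothetical stand-in for aminoAcidMap from exerciseWeek4, and the single-letter codes and "*" for stop are assumptions about that map.

```python
# Hypothetical stand-in for aminoAcidMap from exerciseWeek4
aminoAcidMap = {"ATG": "M", "GTT": "V", "GTA": "V", "GAT": "D", "TAG": "*"}

class Codon:

    def __init__(self, triplet):
        self.triplet = triplet
        self.aa = aminoAcidMap[triplet]

    def __str__(self):
        return self.aa

class ORF:

    def __init__(self, seq):
        # split the sequence into triplets and wrap each in a Codon
        self.codonList = [Codon(seq[i:i+3]) for i in range(0, len(seq), 3)]

    def __str__(self):
        return ''.join([str(codon) for codon in self.codonList])

    def __eq__(self, other):
        # compare the translations of the two ORFs
        return str(self) == str(other)

orf1 = ORF("ATGGTTTAG")
orf2 = ORF("ATGGTATAG")   # synonymous change: GTT -> GTA
orf3 = ORF("ATGGATTAG")   # amino acid change: GTT -> GAT

print(orf1 == orf2)
print(orf1 == orf3)
```

With this definition two ORFs are equal when they translate to the same amino acid sequence, even if their DNA sequences differ.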

Now that you have learned how Python knows how to test the equality of pairs of very different things, you probably also have a clue how it knows how to iterate over very different things like strings, lists and dictionaries.

for character in "kasper":
    print character

for element in [1,2,3,4]:
    print element

for key in {1: 5, 6: 2}:
    print key

The secret is that when an object is placed after the in operator, as it is in a for loop, Python looks for an __iter__ method, which must return an object that has a method called next. In our case we just return self - the object itself. But in doing so we need to make sure that the object has a next method that returns the things we want, one by one. For every iteration of the for loop, next is called on the object. Here we implement it so that it returns each Codon object in turn (which prints as its amino acid). We use the idx attribute to keep track of which element we are at. Once there are no more elements to return, we tell Python by writing raise StopIteration. You don't have to know why - but you can look it up.
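
The steps a for loop takes can be done by hand with the built-in functions iter and next, here sketched on an ordinary list. The built-in next function works in both Python 2 and Python 3; note that in Python 3 the method it calls is named __next__ rather than next.

```python
elements = [1, 2, 3]

# What the for loop does first: ask the object for an iterator
iterator = iter(elements)

# Then it asks for one element at a time
print(next(iterator))
print(next(iterator))
print(next(iterator))

# When nothing is left, StopIteration tells the loop to end
try:
    next(iterator)
except StopIteration:
    print("no more elements")
```

A for loop does exactly this and catches the StopIteration for you, which is why you never see it.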

print '=' * 30
class ORF:

    def __init__(self, seq):
        self.codonList = []
        for i in range(0, len(seq), 3):
            s = seq[i:i+3]
            c = Codon(s)
            self.codonList.append(c)
        #self.codonList = [Codon(seq[i:i+3]) for i in range(0, len(seq), 3)]

    def __str__(self):
        s = ''
        for codon in self.codonList:
            s += str(codon)
        return s

    def __eq__(self, other):
        return str(self) == str(other)

    def __iter__(self):
        self.idx = 0
        return self

    def next(self):
        if self.idx < len(self.codonList):
            i = self.idx
            self.idx += 1
            return self.codonList[i]
        else:
            raise StopIteration

orf = ORF("ATGGTTTAG")
print orf

orf1 = ORF("ATGGTTTAG")
orf2 = ORF("ATGGTATAG")
print orf1 == orf2


for codon in orf:
    print codon

for codon in orf:
    print codon, codon.triplet
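
Returning self from __iter__ means that all loops over the same ORF object share one idx attribute, so a loop nested inside another loop over the same object resets and exhausts it, and the outer loop ends early. A simpler alternative, sketched here and not part of the exercise, is to let __iter__ hand out the list's own iterator. The dictionary is a hypothetical stand-in for aminoAcidMap from exerciseWeek4.

```python
# Hypothetical stand-in for aminoAcidMap from exerciseWeek4
aminoAcidMap = {"ATG": "M", "GTT": "V", "TAG": "*"}

class Codon:

    def __init__(self, triplet):
        self.triplet = triplet
        self.aa = aminoAcidMap[triplet]

    def __str__(self):
        return self.aa

class ORF:

    def __init__(self, seq):
        self.codonList = [Codon(seq[i:i+3]) for i in range(0, len(seq), 3)]

    def __iter__(self):
        # every loop gets its own fresh iterator over the list
        return iter(self.codonList)

orf = ORF("ATGGTTTAG")

for codon in orf:
    print(str(codon) + " " + codon.triplet)

# Nested loops over the same object now see all combinations
pairs = []
for first in orf:
    for second in orf:
        pairs.append(first.triplet + second.triplet)
print(len(pairs))
```

With this version no next method or idx attribute is needed, because the list already knows how to iterate over itself.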
