
Tuesday, 13 December 2016

Python Sorting Part 2: Classes and Attributes

Python Sorting with Classes and Attributes

# Here we define a class representing a student
# each instance holds a name and a list attribute of 1 to 7 items,
# populated with random integer values from 0 to 500


import random # imports the random module

# declares the class object
class myClass(object):
    def __init__(self, inputName):
        self.myName = inputName # creates a myName attribute
        # create a list of random length
        self.myList = [random.randint(0,500) \
                        for x in range(random.randint(1, 7))]
        
    def greet(self):
        # prints out name and list of values the instance holds
        print 'Hi I am {}, my list is {}'.format(self.myName, self.myList)

random.seed(0) # providing a seed to generate consistent random values
# Now we list the names of 5 students in the studentNames tuple
studentNames = ('tommy', 'sally', 'summer', 'jane', 'jonathan')
# For each item in studentNames, create a myClass instance,
# and store the instances as a list in classList
classList = [myClass(x) for x in studentNames]

# What kind of data does the list contain? Let's find out
print classList
# Result: [<__main__.myClass object at 0x0000000057D90470>,
# <__main__.myClass object at 0x0000000057D90320>,
# <__main__.myClass object at 0x0000000057D90240>,
# <__main__.myClass object at 0x0000000057D906A0>,
# <__main__.myClass object at 0x0000000057D906D8>] # 

# They are instances of myClass, that each hold name attribute values of
# 'tommy', 'sally', 'summer', 'jane', 'jonathan' in that order respectively
# They will also each have a list containing 1 to 7 items which are integers
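The default printout above only shows the class name and a memory address. As a side note, here is a minimal sketch of how a `__repr__` method could make that list easier to read. `Student` is a hypothetical stand-in for myClass that takes fixed values instead of random ones, so the output is predictable; it is not part of the original code.

```python
class Student(object):
    """Hypothetical stand-in for myClass, with a fixed list instead of a random one."""
    def __init__(self, name, values):
        self.myName = name
        self.myList = values

    def __repr__(self):
        # Used when an instance is shown inside a printed list
        return 'Student({!r}, {!r})'.format(self.myName, self.myList)

students = [Student('sally', [238, 292, 454]), Student('jane', [492, 405])]
print(students)
# prints: [Student('sally', [238, 292, 454]), Student('jane', [492, 405])]
```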

# Loop through each myClass instance and call their greet() method
for person in classList:
    person.greet()
# Result:
# Hi I am tommy, my list is [379, 210, 129, 256, 202, 392]
# Hi I am sally, my list is [238, 292, 454]
# Hi I am summer, my list is [141, 378, 309, 125]
# Hi I am jane, my list is [492, 405, 451, 155, 365, 450, 342]
# Hi I am jonathan, my list is [50, 217, 306, 457] #

# return just the names stored in myClass.myName attribute
[x.myName for x in classList]
# Result: ['tommy', 'sally', 'summer', 'jane', 'jonathan'] # 

# sorting the classes (and returning their myName values)
# by the length of their myName strings
sorted([x.myName for x in classList], key=len)
# Result: ['jane', 'tommy', 'sally', 'summer', 'jonathan'] # 
# 'jane' is the shortest string and 'jonathan' is the longest string, correct ascending order

# return the name stored in myClass.myName attribute
# in the order of the length of the list contained in their myList attribute
[y.myName for y in sorted([x for x in classList],
  key=lambda student: len(student.myList))]
# Result: ['sally', 'summer', 'jonathan', 'tommy', 'jane'] # 
# 'sally' has a myList attribute containing 3 items
# 'jane' has a myList attribute containing 7 items
# correct ascending order
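When the sort key is simply the value of an attribute, the standard library's `operator.attrgetter` can take the place of a lambda. The sketch below is only an illustration: `Student` and its short fixed lists are hypothetical stand-ins for myClass and its random data.

```python
from operator import attrgetter

class Student(object):
    """Hypothetical stand-in for myClass, with fixed lists."""
    def __init__(self, name, values):
        self.myName = name
        self.myList = values

students = [
    Student('tommy', [379, 210]),
    Student('sally', [238]),
    Student('jane', [492, 405, 451]),
]

# attrgetter('myName') behaves like lambda s: s.myName
by_name = sorted(students, key=attrgetter('myName'))
names_by_name = [s.myName for s in by_name]
print(names_by_name)  # ['jane', 'sally', 'tommy']

# A derived value such as a length still needs a function or a lambda
by_list_len = [s.myName for s in sorted(students, key=lambda s: len(s.myList))]
print(by_list_len)  # ['sally', 'tommy', 'jane']
```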

# we can find out the sum of the items in each instance's list
[(x.myName, sum(x.myList)) for x in classList]
# Result: [('tommy', 1568),
#  ('sally', 984),
#  ('summer', 953),
#  ('jane', 2660),
#  ('jonathan', 1030)] # 


# Now we shall try to sort the class instances 
# ordered by the sum of their list items
[y.myName for y in sorted([x for x in classList], key=lambda inst: sum(inst.myList))]

# Result: ['summer', 'sally', 'jonathan', 'tommy', 'jane'] # 
# 'summer' has a list with the sum of 953
# 'jane' has a list with the sum of 2660
# This is the correct ascending order according to each instance's sum

# When we nest list comprehensions like that, the statement becomes confusing
# and may be difficult to read.

# Here's that statement again, split across two lines so that we can examine it in parts
[y.myName for y in sorted([x for x in classList],
    key=lambda inst: sum(inst.myList))]

# Let's break it down, starting from the inner most list comprehension:
[x for x in classList]
# This gives us the list of myClass instances in the order of the 
# studentNames tuple, from which the class instances were created

# Expanding from that, we come to this statement
sorted([x for x in classList], key=lambda inst: sum(inst.myList))
# We sort this list of class instances by using the key argument
# The lambda function returns the sum of the instance's myList items
# sorted() calls the lambda once per item, so the inst variable holds
# each myClass instance in turn as the list of instances is sorted

# With that sum's value as the key argument, the sorted([x for x...]) part of the statement 
# will return a list containing the class instances ordered by sum of the list's items

# Finally, the outermost list comprehension
[y.myName for y in sorted([x for x...], key=lambda...)]
# This simply takes the already ordered list of class instances 
# (ordered by the sum of each instance's list items)
# and returns the myName attribute of each one
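One way to make the nested statement easier to follow is to give each step its own name. The sketch below is a possible rewrite, not the original code: `Student` and `list_total` are hypothetical names, and the lists are copied from the greet() output earlier in this post so the result can be compared with the one above.

```python
class Student(object):
    """Hypothetical stand-in for myClass that takes its list as an argument."""
    def __init__(self, name, values):
        self.myName = name
        self.myList = values

# Names and lists copied from the greet() output earlier in this post
students = [
    Student('tommy', [379, 210, 129, 256, 202, 392]),
    Student('sally', [238, 292, 454]),
    Student('summer', [141, 378, 309, 125]),
    Student('jane', [492, 405, 451, 155, 365, 450, 342]),
    Student('jonathan', [50, 217, 306, 457]),
]

def list_total(inst):
    # Named replacement for: lambda inst: sum(inst.myList)
    return sum(inst.myList)

# Step 1: sorted() accepts any iterable and returns a new list,
# so the instances can be passed in directly without an inner comprehension
ordered = sorted(students, key=list_total)

# Step 2: pull the name out of each instance, already in sorted order
names = [s.myName for s in ordered]
print(names)  # ['summer', 'sally', 'jonathan', 'tommy', 'jane']
```

The original list is left untouched, because sorted() builds a new list rather than sorting in place.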

Python Sorting Part 1: Dictionary Keys

Sorting Python Dictionary Keys

Recently I ran into the need to perform more involved operations on Python dictionaries and sorting of ordered lists. This led me to a deeper understanding of a side of the sorted() function that I used to shy away from. 

The biggest area that I had to learn about was sorting values by a key -- how it works and the practical applications.

So the following lines of code are my 'journal' in this little learning journey through dictionary operations and sorting. 

# Here's a dictionary d that has strings as keys, and a list of integers for each key's value.
d = {'m': [23, 42, 63], 'm800': [32, 53, 743, 8], 'm23': [3324,425,21], 'a132': [2, 2, 53, 64]}

# Here's a dictionary e that has integers as keys, and a list of integers for each key's value.
e = {1: [42, 43, 52], 200: [3, 53, 63, 2], 60: [4, 62, 96], 30: [63, 89], 500: [32]}

print d
# Result: {'a132': [2, 2, 53, 64], 'm800': [32, 53, 743, 8], 'm': [23, 42, 63], 'm23': [3324, 425, 21]}

# The order of the dictionary that Python prints out for us reflects the order stored internally, 
# which is in no particular order. 
# Indeed, the Python 2 documentation states that dictionaries are not ordered types.

# However, from CPython 3.6 the built-in dict keeps its keys in insertion order
# (an implementation detail in 3.6, guaranteed by the language from Python 3.7).
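If insertion order matters on an interpreter that does not guarantee it, `collections.OrderedDict` (available since Python 2.7) is one option. A small sketch, using a few of d's entries; the variable names `ordered` and `keys` are my own:

```python
from collections import OrderedDict

# Keys come back in the order they were added, whatever the Python version
ordered = OrderedDict()
ordered['m'] = [23, 42, 63]
ordered['m800'] = [32, 53, 743, 8]
ordered['a132'] = [2, 2, 53, 64]

keys = list(ordered.keys())
print(keys)  # ['m', 'm800', 'a132']
```

Note that this preserves the order of insertion only; getting sorted keys still calls for sorted().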

print e
# Result: {200: [3, 53, 63, 2], 1: [42, 43, 52], 60: [4, 62, 96], 500: [32], 30: [63, 89]}

d.keys()
# Result: ['a132', 'm800', 'm', 'm23'] #
# The result returns a list of keys in the order stored internally, same as print d above.

d.values()
# Result: [[2, 2, 53, 64], [32, 53, 743, 8], [23, 42, 63], [3324, 425, 21]] # 
# The result returns a list of the values stored respective to the order of the keys stored internally

# Finding the total number of items across all the lists stored in d
sum([len(d[i]) for i in d.keys()])
# Result: 14 #

# Finding the total number of items across all the lists stored in e
sum([len(e[i]) for i in e.keys()])
# Result: 13 #

# Sorts dictionary d by her keys (which are strings), in ascending alphabetical order
sorted(d.keys())
# Result: ['a132', 'm', 'm23', 'm800'] #
sorted(d) # yields the same result, but I feel this is less readable
# Result: ['a132', 'm', 'm23', 'm800'] #

# Sorts dictionary e by her keys (which are integers), in order of ascending numerical value
sorted(e.keys()) # sorts this dictionary which uses integers as keys, in ascending order
# Result: [1, 30, 60, 200, 500] #

# Using the reverse argument we can reverse the resulting list order of dictionary d keys
sorted(d, reverse=True) # returns d's keys in descending order
# Result: ['m800', 'm23', 'm', 'a132'] #

# Using the reverse argument to reverse the resulting list order of dictionary e keys
sorted(e, reverse=True)
# Result: [500, 200, 60, 30, 1] # 

# When sorted() is used with the key argument, we can sort according to the result of a function / operation / method
sorted(d.keys(), key=str) # sort using the keys themselves (string type) as the ordering criteria
# Result: ['a132', 'm', 'm23', 'm800'] # still in alphabetical order

# We can also sort dictionary e by the string representation of her integer keys
sorted(e.keys(), key=str)
# Result: [1, 200, 30, 500, 60] #
# see how the numbers are compared character by character, like words in a dictionary,
# regardless of actual integer value? (200 came before 30, 500 came before 60)
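The opposite situation comes up just as often: numbers stored as strings that should sort by numeric value. A quick sketch, assuming a hypothetical list `labels` holding e's keys as text:

```python
# e's keys written as strings (hypothetical data for illustration)
labels = ['200', '1', '60', '500', '30']

# Plain string comparison goes character by character
as_text = sorted(labels)
print(as_text)  # ['1', '200', '30', '500', '60']

# key=int compares the numeric values, but the items stay strings
as_numbers = sorted(labels, key=int)
print(as_numbers)  # ['1', '30', '60', '200', '500']
```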

# now we want to sort the keys of d by string length instead
# for this, we use the special method __len__ of the string class which the keys belong to
sorted(d.keys(), key=str.__len__) # use the special method __len__ of the string class 
# Result: ['m', 'm23', 'a132', 'm800'] # arranged from shortest to longest string length

# We can achieve the same result by using the len() function
sorted(d.keys(), key=len) # arranged from shortest to longest string length
# Result: ['m', 'm23', 'a132', 'm800'] #

# dictionary e has integers as her keys, 
# int objects do not have the __len__ in their special methods
sorted(e.keys(), key=str.__len__) # this will cause an error
# Error: descriptor '__len__' requires a 'str' object but received a 'int'

# however we can pass our own function to the key argument. our function takes an input,
# converts it to a string and returns the length of that string
def getStrLength(myInput):
    return len(str(myInput))
sorted(e.keys(), key=getStrLength)
# Result: [1, 60, 30, 200, 500] # correctly sorted in order of length
# note that 60 comes before 30: both have length 2, and because sorted() is stable,
# items with equal keys keep the order they had in e.keys() (60 before 30)
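Python's sorted() is a stable sort, so items that compare equal keep their incoming order. If ties should be broken in a specific way instead, the key function can return a tuple. A short sketch, assuming the keys arrive in the same order that e.keys() returned above:

```python
# Same order as e.keys() in the listing above
keys = [200, 1, 60, 500, 30]

# Length only: ties (60/30 and 200/500) keep their incoming order
by_length = sorted(keys, key=lambda k: len(str(k)))
print(by_length)  # [1, 60, 30, 200, 500]

# Tuple key: compare by length first, then by numeric value to break ties
by_length_then_value = sorted(keys, key=lambda k: (len(str(k)), k))
print(by_length_then_value)  # [1, 30, 60, 200, 500]
```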

# a one-line function like this is an excellent candidate for a lambda function
sorted(e.keys(), key=lambda myInput: len(str(myInput)))
# Result: [1, 60, 30, 200, 500] #
# with lambda we do away with the need to create a named function just for this

# here we have the dictionary f, whose keys are a mix of strings, integers and floats
f = {'m1': [23, 42, 63], 'm2': [32, 53, 743, 8], 'm23': [3324,425,21],
    'a132': [2, 2, 53, 64], 2001: [2, 5, 6, 32], 32: [24, 25], 4: [50],
    2.64: [2, 53, 6], 78.12526: [5, 60, 22]
    }
sorted( f.keys(), key=lambda x: len(str(x)) )
# Result: [4, 32, 'm1', 'm2', 'm23', 2001, 'a132', 2.64, 78.12526] #
# because our lambda function takes any object type passed in and converts it into a string,
# it does not encounter an error, even with keys that are integers or float numbers

# sorting the keys in ascending order of the number of items in each key's list
sorted(f.keys(), key=lambda myKey: len(f[myKey]))
# Result: [4, 32, 'm1', 78.12526, 2.64, 'm23', 2001, 'm2', 'a132'] #
# in the first returned key, f[4] has only 1 item [50]
# in the last item f['a132'] has 4 items [2, 2, 53, 64]
# thus the result is correctly ordered

# sorting the keys in ascending order of the sum of all items in each key's list
sorted(f.keys(), key=lambda myKey: sum(f[myKey]))
# Result: [2001, 32, 4, 2.64, 78.12526, 'a132', 'm1', 'm2', 'm23'] #
# in the first returned key, the sum of f[2001] items is 45
# in the last returned key, the sum of f['m23'] items is 3770
# thus the result is correctly ordered
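The same key idea works on key/value pairs, which saves looking the value up again afterwards. The sketch below reuses dictionary d from the top of this post; `pairs`, `ordered_keys` and `largest` are names I made up for the example.

```python
d = {'m': [23, 42, 63], 'm800': [32, 53, 743, 8],
     'm23': [3324, 425, 21], 'a132': [2, 2, 53, 64]}

# d.items() yields (key, value) tuples; item[1] is the list of integers
pairs = sorted(d.items(), key=lambda item: sum(item[1]))
ordered_keys = [k for k, v in pairs]
print(ordered_keys)  # ['a132', 'm', 'm800', 'm23']

# max() and min() accept the same key argument as sorted()
largest = max(d, key=lambda k: sum(d[k]))
print(largest)  # m23
```

The sums here are 121, 128, 836 and 3770, so there are no ties and the result does not depend on the dictionary's internal order.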