Tuesday, 13 December 2016

Python Sorting Part 1: Dictionary Keys

Sorting Python Dictionary Keys

Recently I ran into the need to perform more involved operations on Python dictionaries and sorting of ordered lists. This led me to a deeper understanding of a side of the sorted() function that I used to shy away from. 

The biggest area that I had to learn about was sorting values by a key -- how it works and the practical applications.

So the following lines of code are my 'journal' in this little learning journey through dictionary operations and sorting. 

# Here's a dictionary d that has strings as keys, and a list of integers for each key's value.
d = {'m': [23, 42, 63], 'm800': [32, 53, 743, 8], 'm23': [3324,425,21], 'a132': [2, 2, 53, 64]}

# Here's a dictionary e that has integers as keys, and a list of integers for each key's value.
e = {1: [42, 43, 52], 200: [3, 53, 63, 2], 60: [4, 62, 96], 30: [63, 89], 500: [32]}

print(d)
# Result: {'a132': [2, 2, 53, 64], 'm800': [32, 53, 743, 8], 'm': [23, 42, 63], 'm23': [3324, 425, 21]}

# The order in which Python prints the dictionary reflects how the keys are stored internally,
# which (in Python 2, used for this journal) is no particular order.
# Indeed, the Python 2 documentation states that dictionaries are unordered.

# However, from Python 3.7 onwards (and as an implementation detail of CPython 3.6),
# the built-in dictionary type preserves the order in which its keys were inserted.

print(e)
# Result: {200: [3, 53, 63, 2], 1: [42, 43, 52], 60: [4, 62, 96], 500: [32], 30: [63, 89]}
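
The note above about insertion order can be checked with a small sketch like the one below. It assumes Python 3.7 or later; on Python 2 the printed order may differ. The name insertion_order is only illustrative.

```python
# Build the dictionary one key at a time so the insertion order is explicit.
d = {}
d['m'] = [23, 42, 63]
d['m800'] = [32, 53, 743, 8]
d['m23'] = [3324, 425, 21]
d['a132'] = [2, 2, 53, 64]

# On Python 3.7+ the keys come back in the order they were inserted.
insertion_order = list(d.keys())
print(insertion_order)
# On Python 3.7+: ['m', 'm800', 'm23', 'a132']
```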

d.keys()
# Result: ['a132', 'm800', 'm', 'm23'] #
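# The result is a list of the keys in the order stored internally, the same order as the printed dictionary above.
# (In Python 3, d.keys() returns a view object rather than a list; list(d.keys()) gives a list.)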

d.values()
# Result: [[2, 2, 53, 64], [32, 53, 743, 8], [23, 42, 63], [3324, 425, 21]] # 
# The result is a list of the values, in the same order as their keys are stored internally

# Finding the total number of items across all of the lists stored in d
sum([len(d[i]) for i in d.keys()])
# Result: 14 #

# Finding the total number of items across all of the lists stored in e
sum([len(e[i]) for i in e.keys()])
# Result: 13 #
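
The same totals can also be computed without looking each key up again, by iterating over the values directly. This is only an alternative sketch; total_d and total_e are names made up for the illustration.

```python
d = {'m': [23, 42, 63], 'm800': [32, 53, 743, 8], 'm23': [3324, 425, 21], 'a132': [2, 2, 53, 64]}
e = {1: [42, 43, 52], 200: [3, 53, 63, 2], 60: [4, 62, 96], 30: [63, 89], 500: [32]}

# A generator expression over the value lists avoids both the d[i] lookup
# and the temporary list that a list comprehension would build.
total_d = sum(len(values) for values in d.values())
total_e = sum(len(values) for values in e.values())

print(total_d)  # 14
print(total_e)  # 13
```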

# Returns a sorted list of dictionary d's keys (which are strings), in ascending alphabetical order
sorted(d.keys())
# Result: ['a132', 'm', 'm23', 'm800'] #
sorted(d) # yields the same result, but I feel this is less readable
# Result: ['a132', 'm', 'm23', 'm800'] #

# Returns a sorted list of dictionary e's keys (which are integers), in order of ascending numerical value
sorted(e.keys())
# Result: [1, 30, 60, 200, 500] #

# Using the reverse argument we can reverse the resulting list order of dictionary d keys
sorted(d, reverse=True) # returns d's keys in descending order
# Result: ['m800', 'm23', 'm', 'a132'] #

# Using the reverse argument to reverse the resulting list order of dictionary e keys
sorted(e, reverse=True)
# Result: [500, 200, 60, 30, 1] # 

# When sorted() is used with the key argument, we can sort according to the result of a function / operation / method
sorted(d.keys(), key=str) # sort using the keys themselves (as strings) as the ordering criteria
# Result: ['a132', 'm', 'm23', 'm800'] # still in alphabetical order

# We can also sort dictionary e by the string representation of her integer keys
sorted(e.keys(), key=str)
# Result: [1, 200, 30, 500, 60] #
# see how the numbers are compared as text, character by character from the left,
# regardless of actual integer value? (200 came before 30, 500 came before 60)
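
One detail worth spelling out: the key function only decides how items are compared, it does not change what is returned. A small sketch (by_value and by_text are illustrative names):

```python
e = {1: [42, 43, 52], 200: [3, 53, 63, 2], 60: [4, 62, 96], 30: [63, 89], 500: [32]}

by_value = sorted(e)           # compares the integers themselves
by_text = sorted(e, key=str)   # compares '1', '200', '30', ... as text

print(by_value)  # [1, 30, 60, 200, 500]
print(by_text)   # [1, 200, 30, 500, 60]

# The returned items are still integers; str() was used only for comparing.
print(all(isinstance(k, int) for k in by_text))  # True

# The text comparison that puts 200 before 30: '2' sorts before '3'.
print('200' < '30')  # True
```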

# now we want to sort the keys of d by string length instead
# for this, we use the special method __len__ of the string class which the keys belong to
sorted(d.keys(), key=str.__len__) # use the special method __len__ of the string class 
# Result: ['m', 'm23', 'a132', 'm800'] # arranged from shortest to longest string length

# We can achieve the same result by using the len() function
sorted(d.keys(), key=len) # arranged from shortest to longest string length
# Result: ['m', 'm23', 'a132', 'm800'] #

# dictionary e has integers as her keys,
# and str.__len__ only accepts str objects (int objects have no __len__ special method at all)
sorted(e.keys(), key=str.__len__) # this will cause an error
# TypeError: descriptor '__len__' requires a 'str' object but received a 'int'
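
To see the failure without stopping a script, the call can be wrapped in try/except. This is just a sketch; the flag name raised is made up for the illustration.

```python
e = {1: [42, 43, 52], 200: [3, 53, 63, 2], 60: [4, 62, 96], 30: [63, 89], 500: [32]}

try:
    # str.__len__ is bound to the str type, so handing it an int fails.
    sorted(e.keys(), key=str.__len__)
    raised = False
except TypeError as err:
    raised = True
    print(err)

print(raised)  # True
```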

# however, we can pass our own function to the key argument. our function takes one input
# (a key from the dictionary), converts it to a string and returns the string's length
def getStrLength(myInput):
    return len(str(myInput))
sorted(e.keys(), key=getStrLength)
# Result: [1, 60, 30, 200, 500] # correctly sorted in order of length
# note that 60 comes before 30: both are of length 2, so they tie on the sort key,
# and sorted() keeps tied items in the order it received them (60 preceded 30 in e.keys())
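
The tie between 60 and 30 can be explored with plain lists, so the input order is under our control. A sketch, assuming the key order printed for e above; get_str_length and the list names are illustrative only.

```python
def get_str_length(value):
    """Return the number of characters in the text form of value."""
    return len(str(value))

# Same order as the keys of e printed earlier in this post.
keys = [200, 1, 60, 500, 30]
print(sorted(keys, key=get_str_length))
# [1, 60, 30, 200, 500]

# Feed the same numbers in a different order: tied items follow the input order.
keys_reordered = [30, 500, 60, 1, 200]
print(sorted(keys_reordered, key=get_str_length))
# [1, 30, 60, 500, 200]

# To break ties by numeric value instead, return a tuple from the key function.
by_length_then_value = sorted(keys, key=lambda v: (len(str(v)), v))
print(by_length_then_value)
# [1, 30, 60, 200, 500]
```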

# a one-line function like this is an excellent candidate for a lambda function
sorted(e.keys(), key=lambda myInput: len(str(myInput)))
# Result: [1, 60, 30, 200, 500] #
# with lambda we do away with the need to create a named function just for this

# here we have the dictionary f, which has a mix of strings, integers and floats as her keys
f = {'m1': [23, 42, 63], 'm2': [32, 53, 743, 8], 'm23': [3324,425,21],
    'a132': [2, 2, 53, 64], 2001: [2, 5, 6, 32], 32: [24, 25], 4: [50],
    2.64: [2, 53, 6], 78.12526: [5, 60, 22]
    }
sorted( f.keys(), key=lambda x: len(str(x)) )
# Result: [4, 32, 'm1', 'm2', 'm23', 2001, 'a132', 2.64, 78.12526] #
# because our lambda function takes any object type passed in and converts it into a string,
# it does not encounter an error, even with keys that are integers or float numbers
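
A key function like this matters even more on Python 3, where strings and numbers cannot be compared with each other at all. The sketch below assumes Python 3; on Python 2 the plain sorted(f) call does not raise. The names mixed_sort_failed and by_length are illustrative.

```python
f = {'m1': [23, 42, 63], 'm2': [32, 53, 743, 8], 'm23': [3324, 425, 21],
     'a132': [2, 2, 53, 64], 2001: [2, 5, 6, 32], 32: [24, 25], 4: [50],
     2.64: [2, 53, 6], 78.12526: [5, 60, 22]}

try:
    # Without a key, Python 3 has to compare str keys against numeric keys.
    sorted(f)
    mixed_sort_failed = False
except TypeError:
    mixed_sort_failed = True

print(mixed_sort_failed)  # True on Python 3

# With the key, only the integer lengths are ever compared, so this works.
by_length = sorted(f, key=lambda x: len(str(x)))
print([len(str(k)) for k in by_length])
# [1, 2, 2, 2, 3, 4, 4, 4, 8]
```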

# sorting the keys in ascending order of the number of items in each key's list
sorted(f.keys(), key=lambda myKey: len(f[myKey]))
# Result: [4, 32, 'm1', 78.12526, 2.64, 'm23', 2001, 'm2', 'a132'] #
# in the first returned key, f[4] has only 1 item [50]
# in the last item f['a132'] has 4 items [2, 2, 53, 64]
# thus the result is correctly ordered

# sorting the keys in ascending order of the sum of all items in each key's list
sorted(f.keys(), key=lambda myKey: sum(f[myKey]))
# Result: [2001, 32, 4, 2.64, 78.12526, 'a132', 'm1', 'm2', 'm23'] #
# in the first returned key, the sum of f[2001] items is 45
# in the last returned key, the sum of f['m23'] items is 3770
# thus the result is correctly ordered
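
If the sums themselves are wanted alongside the keys, one option is to sort the (key, list) pairs from items() instead of the keys alone. This is only a sketch of that variation; pairs_by_sum, keys_by_sum and sums_in_order are made-up names.

```python
f = {'m1': [23, 42, 63], 'm2': [32, 53, 743, 8], 'm23': [3324, 425, 21],
     'a132': [2, 2, 53, 64], 2001: [2, 5, 6, 32], 32: [24, 25], 4: [50],
     2.64: [2, 53, 6], 78.12526: [5, 60, 22]}

# Each item is a (key, list) pair; item[1] is the list, so the sort key is its sum.
pairs_by_sum = sorted(f.items(), key=lambda item: sum(item[1]))

keys_by_sum = [key for key, values in pairs_by_sum]
sums_in_order = [sum(values) for key, values in pairs_by_sum]

print(keys_by_sum)
# [2001, 32, 4, 2.64, 78.12526, 'a132', 'm1', 'm2', 'm23']
print(sums_in_order)
# [45, 49, 50, 61, 87, 121, 128, 836, 3770]
```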

