What is a Hashtable/Hashmap?

A hashtable is a data structure that with a collection of key-value pairs, where each key maps to a value, and the keys must be unique and hashable.

  • In Python there is a built in hashtable known as a dictionary.

The primary purpose of a hashtable is to provide efficient lookup, insertion, and deletion operations. When an element is to be inserted into the hashtable, a hash function is used to map the key to a specific index in the underlying array that is used to store the key-value pairs. The value is then stored at that index. When searching for a value, the hash function is used again to find the index where the value is stored.

The key advantage of a hashtable over other data structures like arrays and linked lists is its average-case time complexity for lookup, insertion, and deletion operations.

  • The typical time complexity of a hashtable is O(1).

What is Hashing and Collision?

Hashing is the process of mapping a given key to a value in a hash table or hashmap, using a hash function. The hash function takes the key as input and produces a hash value or hash code, which is then used to determine the index in the underlying array where the value is stored. The purpose of hashing is to provide a quick and efficient way to access data, by eliminating the need to search through an entire data structure to find a value.

However, it is possible for two different keys to map to the same hash value, resulting in a collision. When a collision occurs, there are different ways to resolve it, depending on the collision resolution strategy used.

Python's dictionary implementation is optimized to handle collisions efficiently, and the performance of the dictionary is generally very good, even in the presence of collisions. However, if the number of collisions is very high, the performance of the dictionary can degrade, so it is important to choose a good hash function that minimizes collisions when designing a Python dictionary.

What is a Set?

my_set = set([1, 2, 3, 2, 1, 'b', '1'])
print(my_set)  

# What do you notice in the output?
#
# The output only contains unique characters. For example, if 1 appears in the set twice, it will only get printed once. Also if there is 1 and "1" in the set, both will get printed because the first 1 is a integer while the second "1" is a string.

# Why do you think Sets are in the same tech talk as Hashmaps/Hashtables?
#
# Dictionaries cannot have duplicate keys either.
{1, 2, 3, '1', 'b'}

Dictionary Example

Below are just some basic features of a dictionary. As always, documentation is always the main source for all the full capablilties.

lover_album = {
    "title": "Lover",
    "artist": "Taylor Swift",
    "year": 2019,
    "genre": ["Pop", "Synth-pop"],
    "tracks": {
        1: "I Forgot That You Existed",
        2: "Cruel Summer",
        3: "Lover",
        4: "The Man",
        5: "The Archer",
        6: "I Think He Knows",
        7: "Miss Americana & The Heartbreak Prince",
        8: "Paper Rings",
        9: "Cornelia Street",
        10: "Death By A Thousand Cuts",
        11: "London Boy",
        12: "Soon You'll Get Better (feat. Dixie Chicks)",
        13: "False God",
        14: "You Need To Calm Down",
        15: "Afterglow",
        16: "Me! (feat. Brendon Urie of Panic! At The Disco)",
        17: "It's Nice To Have A Friend",
        18: "Daylight"
    }
}

# What data structures do you see?
#
# Strings, Integers, Lists, Dictionaries

# Printing the dictionary
print(lover_album)
{'title': 'Lover', 'artist': 'Taylor Swift', 'year': 2019, 'genre': ['Pop', 'Synth-pop'], 'tracks': {1: 'I Forgot That You Existed', 2: 'Cruel Summer', 3: 'Lover', 4: 'The Man', 5: 'The Archer', 6: 'I Think He Knows', 7: 'Miss Americana & The Heartbreak Prince', 8: 'Paper Rings', 9: 'Cornelia Street', 10: 'Death By A Thousand Cuts', 11: 'London Boy', 12: "Soon You'll Get Better (feat. Dixie Chicks)", 13: 'False God', 14: 'You Need To Calm Down', 15: 'Afterglow', 16: 'Me! (feat. Brendon Urie of Panic! At The Disco)', 17: "It's Nice To Have A Friend", 18: 'Daylight'}}
print(lover_album.get('tracks'))
# or
print(lover_album['tracks'])
{1: 'I Forgot That You Existed', 2: 'Cruel Summer', 3: 'Lover', 4: 'The Man', 5: 'The Archer', 6: 'I Think He Knows', 7: 'Miss Americana & The Heartbreak Prince', 8: 'Paper Rings', 9: 'Cornelia Street', 10: 'Death By A Thousand Cuts', 11: 'London Boy', 12: "Soon You'll Get Better (feat. Dixie Chicks)", 13: 'False God', 14: 'You Need To Calm Down', 15: 'Afterglow', 16: 'Me! (feat. Brendon Urie of Panic! At The Disco)', 17: "It's Nice To Have A Friend", 18: 'Daylight'}
{1: 'I Forgot That You Existed', 2: 'Cruel Summer', 3: 'Lover', 4: 'The Man', 5: 'The Archer', 6: 'I Think He Knows', 7: 'Miss Americana & The Heartbreak Prince', 8: 'Paper Rings', 9: 'Cornelia Street', 10: 'Death By A Thousand Cuts', 11: 'London Boy', 12: "Soon You'll Get Better (feat. Dixie Chicks)", 13: 'False God', 14: 'You Need To Calm Down', 15: 'Afterglow', 16: 'Me! (feat. Brendon Urie of Panic! At The Disco)', 17: "It's Nice To Have A Friend", 18: 'Daylight'}
print(lover_album.get('tracks')[4])
# or
print(lover_album['tracks'][4])
The Man
The Man
lover_album["producer"] = set(['Taylor Swift', 'Jack Antonoff', 'Joel Little', 'Taylor Swift', 'Louis Bell', 'Frank Dukes'])

# What can you change to make sure there are no duplicate producers?
#
# you can add set() around the value to remove duplicate producers

# Printing the dictionary
print(lover_album)
{'title': 'Lover', 'artist': 'Taylor Swift', 'year': 2019, 'genre': ['Pop', 'Synth-pop', 'electropop'], 'tracks': {1: 'I Forgot That You Existed', 2: 'Cruel Summer', 3: 'Lover', 4: 'The Man', 5: 'The Archer', 6: 'I Think He Knows', 7: 'Miss Americana & The Heartbreak Prince', 8: 'Paper Rings', 9: 'Cornelia Street', 10: 'Death By A Thousand Cuts', 11: 'London Boy', 12: "Soon You'll Get Better (feat. Dixie Chicks)", 13: 'False God', 14: 'You Need To Calm Down', 15: 'Afterglow', 16: 'Me! (feat. Brendon Urie of Panic! At The Disco)', 17: "It's Nice To Have A Friend", 18: 'Daylight', 19: 'All Of The Girls You Loved Before'}, 'producer': {'Louis Bell', 'Frank Dukes', 'Jack Antonoff', 'Joel Little', 'Taylor Swift'}}
lover_album["tracks"].update({19: "All Of The Girls You Loved Before"})

# How would add an additional genre to the dictionary, like electropop? 
# 
lover_album["genre"].append("electropop")

# Printing the dictionary
print(lover_album)
{'title': 'Lover', 'artist': 'Taylor Swift', 'year': 2019, 'genre': ['Pop', 'Synth-pop', 'electropop'], 'tracks': {1: 'I Forgot That You Existed', 2: 'Cruel Summer', 3: 'Lover', 4: 'The Man', 5: 'The Archer', 6: 'I Think He Knows', 7: 'Miss Americana & The Heartbreak Prince', 8: 'Paper Rings', 9: 'Cornelia Street', 10: 'Death By A Thousand Cuts', 11: 'London Boy', 12: "Soon You'll Get Better (feat. Dixie Chicks)", 13: 'False God', 14: 'You Need To Calm Down', 15: 'Afterglow', 16: 'Me! (feat. Brendon Urie of Panic! At The Disco)', 17: "It's Nice To Have A Friend", 18: 'Daylight', 19: 'All Of The Girls You Loved Before', 20: 'test'}, 'producer': {'Louis Bell', 'Frank Dukes', 'Jack Antonoff', 'Joel Little', 'Taylor Swift'}}
# for k,v in lover_album.items(): # iterate using a for loop for key and value
#     if k == "tracks":
#         print("tracks:")
#     print(str(k) + ": " + str(v))

# Write your own code to print tracks in readable format
for k,v in lover_album["tracks"].items():
    print(k , ":" , v)
1 : I Forgot That You Existed
2 : Cruel Summer
3 : Lover
4 : The Man
5 : The Archer
6 : I Think He Knows
7 : Miss Americana & The Heartbreak Prince
8 : Paper Rings
9 : Cornelia Street
10 : Death By A Thousand Cuts
11 : London Boy
12 : Soon You'll Get Better (feat. Dixie Chicks)
13 : False God
14 : You Need To Calm Down
15 : Afterglow
16 : Me! (feat. Brendon Urie of Panic! At The Disco)
17 : It's Nice To Have A Friend
18 : Daylight
19 : All Of The Girls You Loved Before
# def search():
#     search = input("What would you like to know about the album?")
#     if lover_album.get(search.lower()) == None:
#         print("Invalid Search")
#     else:
#         print(lover_album.get(search.lower()))

# search()

# This is a very basic code segment, how can you improve upon this code?
def search_v2():
    search = input("What would you like to know about the album? Type 'title' to get the title of the album. Type 'artist' to get the name of the artist. Type 'year' to get the album release date. Type 'genre' to get the genre of the album. Type 'tracks' to get the list of songs on the album. Type 'tracks number' to get a specific song from the album. Type 'producer' to get a list of producers on the album.")
    if search == "tracks number":
         song = input("Give a number for witch song you want from the album.")
         print(lover_album['tracks'][int(song)])

    elif search == "add track":
        new_track = input("Give a new track to add to the album.")
        lover_album["tracks"].update({len(lover_album["tracks"])+1: new_track})
        print(new_track, "was added to the album at position", len(lover_album["tracks"]))
        for k,v in lover_album["tracks"].items():
                    print(k, ":", v)
    
    else:
        if lover_album.get(search.lower()) == None:
            print("Invalid Search")

        elif search == "title":
            print("The name of this album is", lover_album.get("title"))

        elif search == "artist":
            print("The artist who created this album is", lover_album.get("artist"))

        elif search == "year":
            print("The year this album was released was", lover_album.get("year"))

        elif search == "genre":
            print("The genres of this album are", ', '.join(lover_album.get("genre")))

        elif search == "tracks":
                print("Here are the tracks from the album:")
                for k,v in lover_album["tracks"].items():
                    print(k, ":", v)
        
        elif search == "producer":
            print("The people who produced this album are", ', '.join(lover_album.get("producer")))
        
        search_v2()

search_v2()
test song was added to the album at position 21
1 : I Forgot That You Existed
2 : Cruel Summer
3 : Lover
4 : The Man
5 : The Archer
6 : I Think He Knows
7 : Miss Americana & The Heartbreak Prince
8 : Paper Rings
9 : Cornelia Street
10 : Death By A Thousand Cuts
11 : London Boy
12 : Soon You'll Get Better (feat. Dixie Chicks)
13 : False God
14 : You Need To Calm Down
15 : Afterglow
16 : Me! (feat. Brendon Urie of Panic! At The Disco)
17 : It's Nice To Have A Friend
18 : Daylight
19 : All Of The Girls You Loved Before
20 : test
21 : test song

Hacks

  • Answer ALL questions in the code segments
  • Create a diagram or comparison illustration (Canva).
    • What are the pro and cons of using this data structure?
Pros Cons
Allows insertion of key value pair. Allows insertion of key value pair.
HashMap is non synchronized. HashMap cannot be shared between multiple threads without proper synchronization.
HashMap is a fail-fast iterator. HashMap is a fail-fast iterator.
Faster access of elements due to hashing technology. HashMap is a fail-fast iterator.
Lists Dictionaries
Can work with any data Dictionaries use key, value pairs
Index starts at 0 Keys can't be changed
Can append, delete, etc Dictionaries don't have to be ordered
Lists use [] Dictionaries use {}
  • Expand upon the code given to you, possible improvements in comments
  • Build your own album showing features of a python dictionary

  • For Mr. Yeung's class: Justify your favorite Taylor Swift song, answer may effect seed

heros_and_villains_album = {
    "title": "Heros and Villains",
    "artist": "Metro Boomin",
    "year": 2022,
    "genre": ["Hip-Hop", "Rap"],
    "producers": ["Metro Boomin", "TM88", "Johan Lenox", "DaHeala", "Allen Ritter", "Honorable C.N.O.T.E.", "Oz"],
    "tracks": {
        1: "Trance (feat. Young Thug, Travis Scott)",
        2: "Superhero (feat. Chris Brown, Future)",
        3: "Feel The Fiyaaaah (feat. Takeoff, A$AP Rocky)",
        4: "I Can't Save You (feat. Future, Don Toliver)",
        5: "Metro Spider (feat. Young Thug)",
        6: "Niagara Falls (feat. 21 Savage, Travis Scott)",
        7: "Lock On Me (feat. Future, Travis Scott)",
        8: "Umbrella (feat. Young Nudy, 21 Savage)",
        9: "Creepin' (feat. 21 Savage, The Weeknd)",
        10: "Walk Em Down (feat. Mustafa the Poet, 21 Savage)",
        11: "Raindrops (feat. Future, Don Toliver)",
        12: "All the Money (feat. Gunna)",
        13: "On Time (feat. John Legend)",
        14: "Around Me (feat. Don Toliver)"
    },
}

def search_v3():
    # asks for what details you would like to know
    search = input("What would you like to know about the album? Type 'title' to get the title of the album. Type 'artist' to get the name of the artist. Type 'year' to get the album release date. Type 'genre' to get the genre of the album. Type 'tracks' to get the list of songs on the album. Type 'tracks number' to get a specific song from the album. Type 'producers' to get a list of producers on the album.")
    # if the input is blank, it stops the code
    if search != "":

        if search == "tracks number":
            # asks for an input
            song = input("Give a number for which song you want from the album.")
            # if input is valid, prints the song name at that position. If it isn't valid, it exits and re-runs the function.
            if song <= len(heros_and_villains_album["tracks"]):
                print("The song at position", song, "is", heros_and_villains_album['tracks'][int(song)])
            else:
                search_v3()

        elif search == "add track":
            # asks for an input
            new_track = input("Give a new track to add to the album.")
            # adds new track to the album
            heros_and_villains_album["tracks"].update({len(heros_and_villains_album["tracks"])+1: new_track})
            print(new_track, "was added to the album at position", len(heros_and_villains_album["tracks"]))
        
        else:
            # checks if the input is not blank, but not valid either.
            if heros_and_villains_album.get(search.lower()) == None:
                print("Invalid Search")

            # if the user is asking for the title, it prints it
            elif search == "title":
                print("The name of this album is", heros_and_villains_album.get("title"))

            # if the user is asking for the artist, it prints it
            elif search == "artist":
                print("The artist who created this album is", heros_and_villains_album.get("artist"))

            # if the user is asking for the year the album was released, it prints it
            elif search == "year":
                print("The year this album was released was", heros_and_villains_album.get("year"))

            # if the user is asking for the genres of the album, it prints it
            elif search == "genre":
                print("The genres of this album are", ', '.join(heros_and_villains_album.get("genre")))

            # if the user is asking for the tracks in the album, it prints it
            elif search == "tracks":
                    print("Here are the tracks from the album:")
                    # uses a for loop to print the keys (song position) and the values (song name)
                    for k,v in heros_and_villains_album["tracks"].items():
                        print(k, ":", v)
            
            # if the user is asking for the producers, it prints it
            elif search == "producers":
                print("The people who produced this album are", ', '.join(heros_and_villains_album.get("producers")))

        # repeats the function  
        search_v3()

search_v3()
The name of this album is Heros and Villains
The artist who created this album is Metro Boomin
The year this album was released was 2022
The genres of this album are Hip-Hop, Rap
Here are the tracks from the album:
1 : Trance (feat. Young Thug, Travis Scott)
2 : Superhero (feat. Chris Brown, Future)
3 : Feel The Fiyaaaah (feat. Takeoff, A$AP Rocky)
4 : I Can't Save You (feat. Future, Don Toliver)
5 : Metro Spider (feat. Young Thug)
6 : Niagara Falls (feat. 21 Savage, Travis Scott)
7 : Lock On Me (feat. Future, Travis Scott)
8 : Umbrella (feat. Young Nudy, 21 Savage)
9 : Creepin' (feat. 21 Savage, The Weeknd)
10 : Walk Em Down (feat. Mustafa the Poet, 21 Savage)
11 : Raindrops (feat. Future, Don Toliver)
12 : All the Money (feat. Gunna)
13 : On Time (feat. John Legend)
14 : Around Me (feat. Don Toliver)
The people who produced this album are Metro Boomin, TM88, Johan Lenox, DaHeala, Allen Ritter, Honorable C.N.O.T.E., Oz
Epic Song (feat. Soham Kamat) was added to the album at position 15
The song at position 15 is Epic Song (feat. Soham Kamat)

My favorite Taylor Swift song is "You Belong With Me" because of the catchy beat and amazing lyrics. My favorite part is

She wears high heels

I wear sneakers

She's Cheer Captain, and I'm on the bleachers

Dreaming about the day when you wake up and find

That what you're looking for has been here the whole time

I love this song and I listen to it all the time.