Beginner

Python Data Structures

Master lists, tuples, dictionaries, and sets

Imagine you're organizing a party. You need a guest list (that can change), a fixed menu that can't be altered, a collection of unique music genres to play, and a way to map each guest to their food preference. Trying to do all this with simple variables would be a nightmare! This is exactly why Python gives you powerful data structures—each designed for specific tasks. Store many items, remove duplicates, map keys to values—data structures do the heavy lifting so you can focus on solving problems.

What Are Data Structures?

A data structure is a way of organizing and storing data so you can access and modify it efficiently. Python provides four essential built-in data structures, each with unique characteristics:

What is a List?

A list is an ordered, mutable collection of items. "Ordered" means the items maintain their position (first item stays first), and "mutable" means you can change, add, or remove items after creating the list. Lists are perfect when you need a collection that can grow, shrink, or change over time.

What is a Tuple?

A tuple is an ordered, immutable collection. Like lists, tuples maintain order, but unlike lists, you cannot modify them after creation. Think of tuples as "locked" or "frozen" lists. They're ideal for storing data that should never change, like coordinates or configuration settings.

What is a Set?

A set is an unordered collection of unique elements. Sets automatically remove duplicates and don't maintain any particular order. They're perfect for membership testing (checking if something exists) and eliminating duplicates from a collection.

What is a Dictionary?

A dictionary is a collection of key-value pairs. It maps unique keys to their corresponding values, like a real dictionary maps words to definitions. Since Python 3.7, dictionaries preserve the order in which keys were inserted. Lookups by key are very fast (constant time on average), which makes dictionaries ideal when you need to associate related pieces of information.

What is List Comprehension?

A list comprehension is a concise, Pythonic way to create lists. Instead of writing multiple lines with a loop, you write a single elegant line that builds your list. It's not a data structure itself, but a powerful tool for creating lists efficiently.

💡 Quick Comparison: Lists, tuples, and dictionaries (Python 3.7+) keep their items in order; sets don't. Lists, sets, and dictionaries are mutable; tuples are immutable. Set elements and dictionary keys must be hashable, which in practice means immutable values such as strings, numbers, or tuples. Use lists for changing sequences, tuples for fixed data, sets for uniqueness, and dictionaries for key-value relationships.
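The ordering and mutability rules are easy to check for yourself. The sketch below uses made-up party data (the names and genres are illustrative only) to show that a set can be changed after creation, that frozenset is the immutable variant, and that a dictionary remembers insertion order in Python 3.7+.

```python
# Sets are mutable: items can be added and removed after creation.
genres = {"jazz", "rock"}
genres.add("funk")
genres.discard("rock")
print(sorted(genres))                  # ['funk', 'jazz']

# frozenset is the immutable variant: it has no add() method.
fixed_genres = frozenset(genres)
print(hasattr(fixed_genres, "add"))    # False

# Set elements must be hashable, so a list cannot go inside a set.
try:
    bad = {["jazz", "rock"]}
except TypeError as error:
    print("TypeError:", error)

# Dictionaries remember insertion order (guaranteed since Python 3.7).
preferences = {}
preferences["Zoe"] = "vegan"
preferences["Adam"] = "pizza"
print(list(preferences))               # ['Zoe', 'Adam']
```

sorted() is used when printing the set because a set's own display order is not guaranteed.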

Understanding Through Analogies

Let's relate each data structure to something familiar:

Visual Mental Model

Here's how to visualize these structures in memory:

# LIST: Ordered boxes you can open and change
list: [0] → a    [1] → b    [2] → c    [3] → d
      ↓          ↓          ↓          ↓
   Can add, remove, or change any box

# TUPLE: Sealed boxes in a fixed order
tuple: (0) → a    (1) → b    (2) → c
       🔒        🔒        🔒
    Cannot modify after creation

# SET: Unordered unique items in a bag
set: { a, b, c }  ← No duplicates, no guaranteed order
     
# DICTIONARY: Key-value pairs like a filing cabinet
dict: {"name" → "Alice", "age" → 25, "city" → "Toronto"}
       ↑           ↑        ↑       ↑      ↑         ↑
      Key       Value     Key    Value   Key      Value

Practical Examples: From Simple to Advanced

Example 1: Basic List Operations

# Creating and modifying a list
fruits = ["apple", "banana", "cherry"]

# Adding items
fruits.append("date")           # Add to end: ["apple", "banana", "cherry", "date"]
fruits.insert(1, "blueberry")   # Insert at index 1: ["apple", "blueberry", "banana", "cherry", "date"]

# Removing items
fruits.remove("banana")         # Remove by value: ["apple", "blueberry", "cherry", "date"]
last_fruit = fruits.pop()       # Remove and return last item: "date"

# Accessing items
print(fruits[0])                # First item: "apple"
print(fruits[-1])               # Last item: "cherry"

# Output:
# apple
# cherry

Example 2: List Comprehensions (The Pythonic Way)

# Traditional way with loop
squares_old = []
for n in range(1, 6):
    squares_old.append(n * n)
print(squares_old)  # [1, 4, 9, 16, 25]

# Pythonic way with list comprehension (ONE LINE!)
squares = [n * n for n in range(1, 6)]
print(squares)      # [1, 4, 9, 16, 25]

# Comprehension with condition (filter even squares)
even_squares = [n * n for n in range(1, 11) if (n * n) % 2 == 0]
print(even_squares) # [4, 16, 36, 64, 100]

# Practical example: Convert temperatures from Celsius to Fahrenheit
celsius = [0, 10, 20, 30, 40]
fahrenheit = [(temp * 9/5) + 32 for temp in celsius]
print(fahrenheit)   # [32.0, 50.0, 68.0, 86.0, 104.0]
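The same comprehension syntax also builds sets and dictionaries, which is worth knowing because a dictionary comprehension appears later in this lesson. A minimal sketch, using an illustrative word list:

```python
words = ["apple", "banana", "cherry", "avocado"]

# Set comprehension: curly braces, one expression per item
first_letters = {word[0] for word in words}
print(sorted(first_letters))   # ['a', 'b', 'c']

# Dict comprehension: curly braces with key: value
word_lengths = {word: len(word) for word in words}
print(word_lengths)            # {'apple': 5, 'banana': 6, 'cherry': 6, 'avocado': 7}
```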

Example 3: Tuples for Immutable Data

# Coordinates that shouldn't change
point = (10, 20)
print(f"X: {point[0]}, Y: {point[1]}")  # X: 10, Y: 20

# Tuple unpacking (elegant way to extract values)
x, y = point
print(f"X: {x}, Y: {y}")  # X: 10, Y: 20

# Returning multiple values from a function
def get_user_info():
    return ("Alice", 25, "Toronto")  # Returns a tuple

name, age, city = get_user_info()
print(f"{name} is {age} years old from {city}")
# Output: Alice is 25 years old from Toronto

# Why tuples? Try to modify (this will fail)
# point[0] = 15  # ❌ TypeError: 'tuple' object does not support item assignment
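Immutability has a practical payoff: a tuple of hashable items can be used as a dictionary key, while a list cannot. The sketch below assumes some rough, illustrative latitude/longitude pairs.

```python
# Tuples are hashable (when their items are), so they can be dictionary keys.
city_by_location = {
    (43.65, -79.38): "Toronto",      # approximate coordinates, for illustration
    (49.28, -123.12): "Vancouver",
}
print(city_by_location[(43.65, -79.38)])   # Toronto

# A list cannot be used the same way because it is mutable.
try:
    city_by_location[[45.50, -73.57]] = "Montreal"
except TypeError:
    print("Lists cannot be dictionary keys")
```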

Example 4: Sets for Uniqueness and Fast Lookups

# Sets automatically remove duplicates
tags = {"ai", "python", "ml", "ai", "python"}
print(tags)  # {'ai', 'python', 'ml'} - duplicates removed! (display order may vary)

# Adding and removing items
tags.add("data-science")     # Add new item
tags.discard("ml")           # Remove item (no error if not found)
print(tags)  # {'ai', 'python', 'data-science'} (display order may vary)

# Fast membership testing (very efficient for large sets)
print("ai" in tags)          # True - instant lookup!
print("java" in tags)        # False

# Set operations (like math!)
skills_alice = {"python", "sql", "excel"}
skills_bob = {"python", "java", "excel"}

common_skills = skills_alice & skills_bob        # Intersection: {"python", "excel"}
all_skills = skills_alice | skills_bob           # Union: {"python", "sql", "excel", "java"}
alice_only = skills_alice - skills_bob           # Difference: {"sql"}

print(f"Common skills: {common_skills}")
print(f"Alice's unique skills: {alice_only}")
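Two more set operations round out the picture: symmetric difference and subset tests. This sketch reuses the skill sets from the example; the required set is a hypothetical job requirement added for illustration.

```python
skills_alice = {"python", "sql", "excel"}
skills_bob = {"python", "java", "excel"}

# Symmetric difference: skills that exactly one of them has
not_shared = skills_alice ^ skills_bob
print(sorted(not_shared))               # ['java', 'sql']

# Subset test: does a person cover every required skill?
required = {"python", "sql"}            # hypothetical requirement
print(required <= skills_alice)         # True
print(required.issubset(skills_bob))    # False
```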

Example 5: Dictionaries for Key-Value Mappings

# Creating a student profile
student = {
    "name": "Aisha",
    "age": 20,
    "major": "Computer Science",
    "gpa": 3.8
}

# Accessing values
print(student["name"])        # "Aisha"
print(student.get("age"))     # 20 (safer way)
print(student.get("email", "Not provided"))  # "Not provided" (default if key missing)

# Adding and modifying
student["email"] = "aisha@university.edu"  # Add new key-value pair
student["gpa"] = 3.9                       # Update existing value

# Looping through dictionary
for key, value in student.items():
    print(f"{key}: {value}")

# Practical example: Grade calculator
grades = {"math": 85, "english": 92, "science": 88, "history": 90}
total = sum(grades.values())
average = total / len(grades)
print(f"Average grade: {average:.1f}")  # Average grade: 88.8
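Checking for a key and removing one are just as common as adding and updating. A short sketch, assuming a trimmed-down version of the student profile:

```python
student = {"name": "Aisha", "age": 20, "gpa": 3.8}

# Check for a key before using it
if "gpa" in student:
    print(f"GPA on file: {student['gpa']}")   # GPA on file: 3.8

# pop() removes a key and returns its value; a default avoids a KeyError
removed_age = student.pop("age")
missing = student.pop("email", None)
print(removed_age, missing)                   # 20 None

# keys() shows what is left, in insertion order
print(list(student.keys()))                   # ['name', 'gpa']
```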

Data Structure Comparison Table

Feature     | List               | Tuple      | Set                   | Dictionary
Syntax      | [1, 2]             | (1, 2)     | {1, 2} (empty: set()) | {key: value} (empty: {})
Ordered?    | ✅ Yes             | ✅ Yes     | ❌ No                 | ✅ Yes (Python 3.7+)
Mutable?    | ✅ Yes             | ❌ No      | ✅ Yes                | ✅ Yes
Duplicates? | ✅ Allowed         | ✅ Allowed | ❌ No                 | Keys: No, Values: Yes
Best For    | Changing sequences | Fixed data | Unique items          | Key-value pairs

Hands-On Practice: Step-by-Step Exercises

Exercise 1: Remove Duplicate Names from a List

Step 1: Create a new Python file called duplicate_remover.py

Step 2: Write the starter code

# duplicate_remover.py

# List of student names with duplicates
students = ["Alice", "Bob", "Charlie", "Alice", "David", "Bob", "Eve", "Alice"]

print("Original list:", students)
print(f"Number of names: {len(students)}")

# Your task: Convert to set to remove duplicates, then back to list
unique_students = list(set(students))

print("\nUnique students:", unique_students)
print(f"Number of unique names: {len(unique_students)}")

Step 3: Run the program in your terminal

python duplicate_remover.py

Step 4: Expected output

Original list: ['Alice', 'Bob', 'Charlie', 'Alice', 'David', 'Bob', 'Eve', 'Alice']
Number of names: 8

Unique students: ['Charlie', 'Eve', 'Alice', 'Bob', 'David']
Number of unique names: 5

⚠️ Note: The order of unique students may vary from run to run because sets are unordered. If you need to preserve the original order, remove duplicates with list(dict.fromkeys(students)) instead, since dictionaries keep insertion order in Python 3.7+.
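One order-preserving approach is to pass the list through dictionary keys rather than a set. A minimal sketch, relying on the fact that dictionaries keep insertion order in Python 3.7+ and that keys are unique:

```python
students = ["Alice", "Bob", "Charlie", "Alice", "David", "Bob", "Eve", "Alice"]

# dict.fromkeys() keeps the first occurrence of each name, in order
unique_in_order = list(dict.fromkeys(students))
print(unique_in_order)   # ['Alice', 'Bob', 'Charlie', 'David', 'Eve']
```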

Exercise 2: Build a Course Grade Dictionary

Step 1: Create a file called grade_book.py

Step 2: Build the grade mapping system

# grade_book.py

# Create an empty dictionary for grades
grades = {}

# Add courses and grades
grades["Math"] = 85
grades["English"] = 92
grades["Science"] = 88
grades["History"] = 90
grades["Art"] = 95

# Display all grades
print("=== Grade Book ===")
for course, grade in grades.items():
    print(f"{course}: {grade}%")

# Calculate statistics
average = sum(grades.values()) / len(grades)
highest = max(grades.values())
lowest = min(grades.values())

print(f"\nAverage: {average:.1f}%")
print(f"Highest: {highest}%")
print(f"Lowest: {lowest}%")

# Find which course has the highest grade
best_course = max(grades, key=grades.get)
print(f"Best performance: {best_course}")

Step 3: Run and verify

python grade_book.py

Step 4: Expected output

=== Grade Book ===
Math: 85%
English: 92%
Science: 88%
History: 90%
Art: 95%

Average: 90.0%
Highest: 95%
Lowest: 85%
Best performance: Art
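As a possible extension of the grade book, the same dictionary can be ranked and filtered. The sketch below is one way to do it; the cutoff of 90 is an arbitrary example value, not part of the exercise.

```python
grades = {"Math": 85, "English": 92, "Science": 88, "History": 90, "Art": 95}

# Sort (course, grade) pairs by grade, highest first
ranking = sorted(grades.items(), key=lambda pair: pair[1], reverse=True)
for position, (course, grade) in enumerate(ranking, start=1):
    print(f"{position}. {course}: {grade}%")

# Courses at or above a cutoff (90 is an arbitrary example threshold)
strong_courses = [course for course, grade in grades.items() if grade >= 90]
print(strong_courses)    # ['English', 'History', 'Art']
```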

Common Mistakes and How to Fix Them

Mistake 1: Using Lists When Dictionaries Are Needed

Problem: Storing related data in parallel lists makes code confusing and error-prone.

# ❌ BAD: Parallel lists (hard to maintain)
names = ["Alice", "Bob", "Charlie"]
ages = [25, 30, 28]
cities = ["Toronto", "Vancouver", "Montreal"]

# What if you add a name but forget to add age/city? Data becomes misaligned!

Solution: Use a dictionary or list of dictionaries

# ✅ GOOD: Dictionary keeps related data together
people = {
    "Alice": {"age": 25, "city": "Toronto"},
    "Bob": {"age": 30, "city": "Vancouver"},
    "Charlie": {"age": 28, "city": "Montreal"}
}

print(people["Alice"]["age"])  # 25
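The list-of-dictionaries alternative mentioned in the solution could look like the sketch below. It suits cases where names might repeat or where you mostly loop over everyone rather than look one person up by name.

```python
people = [
    {"name": "Alice", "age": 25, "city": "Toronto"},
    {"name": "Bob", "age": 30, "city": "Vancouver"},
    {"name": "Charlie", "age": 28, "city": "Montreal"},
]

# Each record carries all of its own fields, so nothing can drift out of alignment
for person in people:
    print(f"{person['name']} ({person['age']}) lives in {person['city']}")

# Filtering is a one-line comprehension
over_26 = [person["name"] for person in people if person["age"] > 26]
print(over_26)   # ['Bob', 'Charlie']
```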

Mistake 2: Expecting Sets to Maintain Order

Problem: Assuming set elements will stay in the order you added them.

# ❌ Don't rely on set order
numbers = {3, 1, 4, 1, 5, 9, 2, 6}
print(numbers)  # Order is unpredictable!

Solution: If order matters, use a list or convert to sorted list

# ✅ GOOD: Sort when needed
numbers = {3, 1, 4, 1, 5, 9, 2, 6}
sorted_numbers = sorted(numbers)
print(sorted_numbers)  # [1, 2, 3, 4, 5, 6, 9]

Mistake 3: Trying to Modify Tuples

Problem: Attempting to change tuple contents after creation.

# ❌ This will crash!
coordinates = (10, 20)
coordinates[0] = 15  # TypeError: 'tuple' object does not support item assignment

Solution: Create a new tuple instead

# ✅ GOOD: Create new tuple with updated values
coordinates = (10, 20)
coordinates = (15, coordinates[1])  # New tuple: (15, 20)

# Or convert to list, modify, then back to tuple
temp_list = list(coordinates)
temp_list[0] = 15
coordinates = tuple(temp_list)

Mistake 4: Forgetting Dictionary Keys Are Case-Sensitive

Problem: Accessing a dictionary key with the wrong case raises a KeyError.

# ❌ Case mismatch causes error
user = {"Name": "Alice", "Age": 25}
print(user["name"])  # KeyError: 'name' (should be "Name")

Solution: Use consistent casing and .get() for safety

# ✅ GOOD: Use .get() with default value
user = {"Name": "Alice", "Age": 25}
print(user.get("name", "Not found"))  # "Not found"
print(user.get("Name"))                # "Alice"

# Or normalize keys to lowercase
user = {key.lower(): value for key, value in user.items()}
print(user["name"])  # Works now!

Mistake 5: Modifying List While Looping

Problem: Removing items from a list while iterating causes skipped elements.

# ❌ BAD: Modifying list during iteration
numbers = [1, 2, 3, 4, 5]
for num in numbers:
    if num % 2 == 0:
        numbers.remove(num)  # Causes unexpected behavior!
print(numbers)  # [1, 3, 5] here, but only by luck: adjacent even numbers would be skipped
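The skipping becomes visible as soon as two matching items sit next to each other. In this small demonstration every number is even, yet half of them survive the loop:

```python
numbers = [2, 4, 6, 8]
for num in numbers:
    if num % 2 == 0:
        numbers.remove(num)   # later items shift left while the loop position moves on
print(numbers)  # [4, 8]
```

After 2 is removed, 4 slides into position 0, but the loop has already moved to position 1, so 4 is never examined. The same thing happens to 8.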

Solution: Use list comprehension or iterate over a copy

# ✅ GOOD: List comprehension
numbers = [1, 2, 3, 4, 5]
numbers = [num for num in numbers if num % 2 != 0]
print(numbers)  # [1, 3, 5]

# Or iterate over a copy
numbers = [1, 2, 3, 4, 5]
for num in numbers[:]:  # [:] creates a copy
    if num % 2 == 0:
        numbers.remove(num)

Mistake 6: Using Mutable Default Arguments in Functions

Problem: Using a list as a default parameter causes unexpected sharing between function calls.

# ❌ BAD: Mutable default argument
def add_item(item, items=[]):
    items.append(item)
    return items

list1 = add_item("apple")
list2 = add_item("banana")
print(list2)  # ['apple', 'banana'] - Wait, what?!
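The reason is that Python evaluates a default value once, when the function is defined, and then reuses the same object on every call. The sketch below repeats the buggy function and inspects its __defaults__ attribute to make the shared list visible.

```python
def add_item(item, items=[]):   # same buggy signature as above
    items.append(item)
    return items

print(add_item.__defaults__)    # ([],)
list1 = add_item("apple")
list2 = add_item("banana")
print(add_item.__defaults__)    # (['apple', 'banana'],)
print(list1 is list2)           # True: both names point at the one default list
```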

Solution: Use None as default and create new list inside function

# ✅ GOOD: Use None as default
def add_item(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items

list1 = add_item("apple")
list2 = add_item("banana")
print(list2)  # ['banana'] - Perfect!

Mini-Project: Word Frequency Counter

Project Goal: Build a program that analyzes text and counts how many times each word appears.

Requirements

Step-by-Step Implementation

Step 1: Create word_counter.py

Step 2: Build the basic counter

# word_counter.py

print("=== Word Frequency Counter ===")
print("Enter a sentence or paragraph:")

# Get user input
text = input("Text: ").lower()

# Split into words
words = text.split()

# Count frequencies using dictionary
word_counts = {}
for word in words:
    # If word exists, increment count; otherwise, start at 1
    word_counts[word] = word_counts.get(word, 0) + 1

# Display results
print("\n=== Word Frequencies ===")
for word, count in word_counts.items():
    print(f"{word}: {count}")

# Show statistics
print(f"\nTotal words: {len(words)}")
print(f"Unique words: {len(word_counts)}")

Step 3: Test with sample input

python word_counter.py

Sample Input:

Python is amazing and Python is powerful. Python makes programming fun and Python is easy to learn.

Expected Output:

=== Word Frequencies ===
python: 4
is: 3
amazing: 1
and: 2
powerful.: 1
makes: 1
programming: 1
fun: 1
easy: 1
to: 1
learn.: 1

Total words: 17
Unique words: 11

Bonus Challenges

Challenge 1: Remove punctuation before counting

import string

# Remove punctuation
text = text.translate(str.maketrans('', '', string.punctuation))
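The punctuation step has to run before the text is split into words. Here is one way the pieces might fit together, wrapped in a hypothetical helper named count_words (the name is an assumption for this sketch) and run on the sample sentence from Step 3:

```python
import string

def count_words(text):
    """Return a dict mapping each lowercase word to its count, ignoring punctuation."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    counts = {}
    for word in cleaned.split():
        counts[word] = counts.get(word, 0) + 1
    return counts

sample = ("Python is amazing and Python is powerful. "
          "Python makes programming fun and Python is easy to learn.")
word_counts = count_words(sample)
print(word_counts["python"], word_counts["is"], word_counts["learn"])   # 4 3 1
```

Note that string.punctuation also contains apostrophes and hyphens, so "don't" would become "dont" with this approach.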

Challenge 2: Display words sorted by frequency (most common first)

# Sort by count (descending)
sorted_words = sorted(word_counts.items(), key=lambda x: x[1], reverse=True)

print("\n=== Most Common Words ===")
for word, count in sorted_words[:5]:  # Top 5
    print(f"{word}: {count}")

Challenge 3: Create a bar chart visualization using text

# Display bar chart
print("\n=== Word Frequency Bar Chart ===")
for word, count in sorted_words:
    bar = "█" * count
    print(f"{word:15} | {bar} ({count})")

💡 Advanced Tip: Python's collections.Counter class does this automatically! Try: from collections import Counter and Counter(words). But understanding how to build it yourself with dictionaries is crucial!
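For comparison, a minimal Counter sketch on a short illustrative sentence. Counter is a dictionary subclass, so everything you know about dictionaries still applies:

```python
from collections import Counter

words = "python is amazing and python is powerful".split()
word_counts = Counter(words)

print(word_counts["python"])         # 2
print(word_counts.most_common(2))    # [('python', 2), ('is', 2)]
print(word_counts["java"])           # 0 (missing keys count as zero, no KeyError)
```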

Summary: Key Takeaways

What You've Learned

What Makes Each Structure Special?

Decision Guide: Which Structure Should You Use?

Ask yourself:

🚀 Practice Makes Perfect - Coding Platforms

Master data structures through hands-on practice with these interactive platforms

LeetCode - Problem Solving Practice

Price: Free (Basic) | $35/month (Premium) | Perfect for: Building problem-solving skills & interview prep

Solve thousands of Python problems organized by topic. Focus on data structure challenges to solidify your understanding through real algorithmic problems.

Why developers love LeetCode:

  • 2,500+ problems from easy to expert level
  • Detailed solutions and discussion forums
  • Run and test your code instantly
  • Track progress and compare with other learners

HackerRank - Structured Python Path

Price: Free | Perfect for: Gamified learning with skill certifications

Follow guided Python learning paths with badges and certificates. Great for beginners who want structured practice with immediate feedback and achievement tracking.

What makes HackerRank great:

  • Beginner-friendly Python track (start easy, build up)
  • Earn verified certificates to showcase on LinkedIn
  • Companies use HackerRank for hiring (build your profile)
  • 100% free for learners—no premium required

💡 Practice platforms help you apply what you've learned. Spend 15-30 min daily solving problems to build muscle memory!

Your Next Steps

Now that you've mastered Python's data structures, you're ready to organize them into reusable blocks of code. Here's what to practice:

🎯 Pro Tip: The best programmers don't memorize syntax—they understand when to use each tool. You now have four powerful tools in your toolkit. The real skill is choosing the right one for each job. Keep practicing, and soon it'll become second nature!

Remember: Data structures are the foundation of all programs. Master them, and you'll write cleaner, faster, and more elegant code. You're not just learning Python—you're learning to think like a programmer. Keep going! 🚀

🎯 Test Your Knowledge: Data Structures

Check your understanding with this quick quiz

1. Which data structure should you use to store unique email addresses from user signups?

List - because you need to maintain order
Set - because it automatically prevents duplicates
Tuple - because emails shouldn't change
Dictionary - because you need key-value pairs

2. What will this code output? nums = [x**2 for x in range(4)]

[1, 2, 3, 4]
[0, 1, 4, 9]
[0, 2, 4, 6]
[1, 4, 9, 16]

3. Why would you choose a tuple over a list for storing GPS coordinates?

Tuples are faster to create
Coordinates shouldn't change once set, and tuples prevent accidental modification
Tuples use less memory than lists
You can't use lists for numeric data

4. What's the best way to count how many times each word appears in a text?

Use a list and count manually with loops
Use a dictionary where keys are words and values are counts
Use a set to store unique words
Use a tuple to keep words in order

5. What happens when you try to add a duplicate value to a set?

Python raises an error
The duplicate is added but marked as duplicate
The duplicate is silently ignored; the set remains unchanged
The set is converted to a list