Sets in Python are collections that store unique, unordered elements. They’re particularly useful for handling tasks where you need to manage distinct items without duplicates, and they support mathematical operations like unions and intersections.
What is a Set?
A set is an unordered collection of unique elements. Unlike lists or tuples, sets don’t allow duplicate items. You create one by listing elements inside curly braces, such as `{1, 2, 3}`, or by calling the `set()` function on an iterable.
Basic Syntax
my_set = {1, 2, 3, 4}  # Using curly braces
another_set = set([1, 2, 3])  # Using the set() function
Empty Set
To create an empty set, you need to use `set()` since `{}` creates an empty dictionary in Python.
empty_set = set()
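A quick check in the interpreter makes the difference visible. This is a minimal sketch; the name `empty_braces` is only illustrative.

```python
# {} is an empty dict, not an empty set
empty_braces = {}
empty_set = set()

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set
print(len(empty_set))               # 0
```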
Key Properties of Sets
- Unordered: Elements in a set have no specific order, so indexing and slicing are not allowed.
- Unique: A set automatically removes duplicates from the collection.
- Mutable: Sets themselves are mutable (you can add or remove items), but every element must be hashable, which in practice means immutable values such as integers, strings, and tuples of hashable items.
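The three properties above can be seen in a short sketch. The variable names (`numbers`, `indexing_error`, `unhashable_error`) are illustrative and not part of any API.

```python
# Unique: duplicates are dropped when the set is built
numbers = {3, 1, 2, 3, 1}
print(len(numbers))  # 3

# Unordered: indexing is not supported and raises TypeError
indexing_error = None
try:
    numbers[0]
except TypeError as exc:
    indexing_error = exc
print(type(indexing_error).__name__)  # TypeError

# Mutable: elements can be added, but each one must be hashable
numbers.add(4)
unhashable_error = None
try:
    numbers.add([5, 6])  # a list is unhashable, so this fails
except TypeError as exc:
    unhashable_error = exc
print(type(unhashable_error).__name__)  # TypeError

numbers.add((5, 6))  # a tuple of hashable items is accepted
print(len(numbers))  # 5
```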
Examples of Sets in Python
Example 1: Removing Duplicate Data
Let’s say you have a list of email addresses and want to filter out duplicates. Sets are a quick and efficient way to do this.
emails = ["madhu@example.com", "babu@example.com", "madhu@example.com", "shri@example.com"]
unique_emails = set(emails)
print(unique_emails)
# Output (element order may vary): {'madhu@example.com', 'babu@example.com', 'shri@example.com'}
Since sets automatically remove duplicates, they’re perfect for situations where you need distinct values.
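A set does not keep the order of the original list. If order matters, one common alternative (not part of the example above) is `dict.fromkeys`, which relies on dictionaries preserving insertion order in Python 3.7 and later. A minimal sketch, with illustrative variable names:

```python
emails = [
    "madhu@example.com",
    "babu@example.com",
    "madhu@example.com",
    "shri@example.com",
]

# A set removes duplicates but makes no promise about order
unique_unordered = set(emails)
print(len(unique_unordered))  # 3

# dict keys are unique and keep first-seen order
unique_in_order = list(dict.fromkeys(emails))
print(unique_in_order)
# ['madhu@example.com', 'babu@example.com', 'shri@example.com']
```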
Common Set Operations
Python sets support a range of useful operations for working with unique collections of items.
Adding and Removing Elements
- Adding Elements:
You can use `add()` to add an element to a set.
my_set = {1, 2, 3}
my_set.add(4)
print(my_set)  # Output: {1, 2, 3, 4}
- Removing Elements:
You can remove items with `remove()`, which raises a `KeyError` if the item doesn’t exist, or with `discard()`, which silently does nothing in that case.
my_set.remove(2)  # Removes the number 2 from the set
my_set.discard(5)  # Safe to discard even if 5 isn’t in the set
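The difference between the two methods shows up only when the element is missing. A small sketch, where `missing_error` is an illustrative name:

```python
my_set = {1, 2, 3, 4}

my_set.remove(2)   # 2 is present, so it is removed
my_set.discard(5)  # 5 is absent, nothing happens

missing_error = None
try:
    my_set.remove(5)  # 5 is absent, so remove() raises KeyError
except KeyError as exc:
    missing_error = exc

print(type(missing_error).__name__)  # KeyError
print(my_set == {1, 3, 4})           # True
```

`discard()` is usually the simpler choice when a missing element is not an error in your program, while `remove()` helps surface unexpected state.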
Mathematical Set Operations
1. Union (`|`): Combines all unique elements from two sets.
set_a = {1, 2, 3}
set_b = {3, 4, 5}
union_set = set_a | set_b
print(union_set)  # Output: {1, 2, 3, 4, 5}
2. Intersection (`&`): Retrieves only the common elements between sets.
intersection_set = set_a & set_b
print(intersection_set)  # Output: {3}
3. Difference (`-`): Elements in the first set that aren’t in the second.
difference_set = set_a - set_b
print(difference_set)  # Output: {1, 2}
4. Symmetric Difference (`^`): Elements in either of the sets, but not both.
sym_diff_set = set_a ^ set_b
print(sym_diff_set)  # Output: {1, 2, 4, 5}
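Each operator also has a method form: `union()`, `intersection()`, `difference()`, and `symmetric_difference()`. One practical difference is that the methods accept any iterable, while the operators require a set on both sides. The sketch below illustrates this; `operator_error` is an illustrative name.

```python
set_a = {1, 2, 3}
set_b = {3, 4, 5}

# Method forms give the same results as the operators
print(set_a.union(set_b) == set_a | set_b)                 # True
print(set_a.intersection(set_b) == set_a & set_b)          # True
print(set_a.difference(set_b) == set_a - set_b)            # True
print(set_a.symmetric_difference(set_b) == set_a ^ set_b)  # True

# Methods accept any iterable, such as a list or a range
print(sorted(set_a.union([3, 4, 5])))        # [1, 2, 3, 4, 5]
print(sorted(set_a.difference(range(3, 6)))) # [1, 2]

# Operators do not: a set combined with a list raises TypeError
operator_error = None
try:
    set_a | [3, 4, 5]
except TypeError as exc:
    operator_error = exc
print(type(operator_error).__name__)  # TypeError
```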
Example 2: Checking for Duplicates Across Data Sources
Imagine you’re aggregating user data from multiple sources. You can use sets to find and handle duplicate or unique user IDs.
source_a_ids = {101, 102, 103, 104}
source_b_ids = {103, 104, 105, 106}

# Find users unique to each source
unique_to_a = source_a_ids - source_b_ids
unique_to_b = source_b_ids - source_a_ids

# Find common users
common_users = source_a_ids & source_b_ids

print("Unique to source A:", unique_to_a)  # Output: {101, 102}
print("Unique to source B:", unique_to_b)  # Output: {105, 106}
print("Common users:", common_users)  # Output: {103, 104}
Using set operations, you can manage overlapping data more concisely, and usually faster, than by looping over lists, because a set membership check does not have to scan the whole collection.
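For comparison, here is a sketch of the same "common users" lookup written with plain lists and with sets. The list version scans `source_b_ids` for every ID in `source_a_ids`, while the set version relies on hash-based lookups that take roughly constant time on average. The names `common_loop` and `common_set` are illustrative.

```python
source_a_ids = [101, 102, 103, 104]
source_b_ids = [103, 104, 105, 106]

# List-based: each `in` check walks through source_b_ids
common_loop = [uid for uid in source_a_ids if uid in source_b_ids]

# Set-based: convert once, then intersect
common_set = set(source_a_ids) & set(source_b_ids)

print(common_loop)                        # [103, 104]
print(sorted(common_set) == common_loop)  # True
```

For a handful of IDs the difference is negligible; it tends to matter as the collections grow.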
Set Methods and Built-in Functions
Some useful methods for working with sets include:
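As a minimal sketch, a few methods and built-ins of the standard `set` type are shown below. This is a selection rather than a complete list, and the variable names are illustrative.

```python
tags = {"python", "tutorial"}

# update(): add every element of an iterable, skipping duplicates
tags.update(["sets", "python"])
print(len(tags))       # 3

# Membership test with `in`
print("sets" in tags)  # True

# Subset, superset, and disjoint checks
print({"python"}.issubset(tags))            # True
print(tags.issuperset({"python", "sets"}))  # True
print(tags.isdisjoint({"java"}))            # True

# copy() makes an independent shallow copy; clear() empties the set
snapshot = tags.copy()
tags.clear()
print(len(tags), len(snapshot))  # 0 3
print(sorted(snapshot))          # ['python', 'sets', 'tutorial']
```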
Example 3: Tags for Articles or Products
Let’s say you’re categorizing blog articles by tags. Sets are perfect here because a tag should only appear once for a single article.
tags = {"python", "development", "programming"}
new_tags = {"tutorial", "programming", "python"}

# Update tags to include only unique tags
all_tags = tags | new_tags
print(all_tags)
# Output (element order may vary): {'python', 'development', 'tutorial', 'programming'}
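The `|` operator builds a new set and leaves `tags` unchanged. If you would rather modify the article's tag set in place, `|=` (equivalent to `update()` for sets) is one option. A short sketch:

```python
tags = {"python", "development", "programming"}
new_tags = {"tutorial", "programming", "python"}

# In-place union: tags itself grows, no new set is created
tags |= new_tags

print(sorted(tags))
# ['development', 'programming', 'python', 'tutorial']
```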
When to use Sets in Python
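Sets are particularly useful when you need to remove duplicates from a collection, compare unique identifiers across datasets, or keep a group of labels such as tags free of repeats.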
Summary
- Sets are unordered, mutable collections of unique elements.
- Common applications include removing duplicates, managing unique data, and performing set operations like union and intersection.
- Real-world use cases: Deduplicating lists, comparing unique identifiers across datasets, managing tags/categories.
Sets can greatly simplify certain tasks, making them a valuable addition to any developer’s toolkit in Python.