Data Types and Structures in Python

Python's built-in data types and structures make it straightforward to organize and manipulate data, and knowing them well is essential for effective programming, especially in data science. This article covers four built-in structures (lists, tuples, dictionaries, and sets) plus the DataFrame from the third-party pandas library, with practical examples of how to add, remove, and change values in each.


1. Lists

A list is an ordered, mutable collection of items that can hold elements of different data types. Lists are created using square brackets [] and can be modified after their creation.

Creating a List

my_list = [1, 2, 3, 4, 5]
fruits = ["apple", "banana", "cherry"]
mixed_list = [1, "banana", 3.14, True]

Adding, Removing, and Modifying Values

  • Adding Items:

    fruits.append("orange")  # Adds "orange" to the end
    fruits.insert(1, "kiwi") # Inserts "kiwi" at index 1
  • Removing Items:

    fruits.remove("banana")  # Removes the first occurrence of "banana"
    last_fruit = fruits.pop() # Removes and returns the last item
  • Modifying Items:

    fruits[0] = "mango"  # Changes the first item to "mango"

Example:

fruits = ["apple", "banana", "cherry"]
fruits.append("orange")
fruits.insert(1, "kiwi")
fruits.remove("banana")
last_fruit = fruits.pop()
fruits[0] = "mango"
print(fruits) # Output: ['mango', 'kiwi', 'cherry']
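
The comments above note that remove() deletes only the first occurrence of a value. The following is a minimal sketch of that behavior, using a hypothetical `colors` list (not part of the original example) that contains a duplicate. It also shows that pop() accepts an index and that remove() raises ValueError when the value is absent.

```python
# Hypothetical list with a duplicate value, used only for illustration
colors = ["red", "blue", "red", "green"]

colors.remove("red")       # Only the first "red" is removed
print(colors)              # ['blue', 'red', 'green']

second = colors.pop(1)     # pop() can take an index; returns the removed item
print(second, colors)      # red ['blue', 'green']

# remove() raises ValueError if the value is not in the list
try:
    colors.remove("purple")
    missing = False
except ValueError:
    missing = True
print(missing)             # True
```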

2. Tuples

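A tuple is similar to a list, but it is immutable: once it is created, its items cannot be reassigned, added, or removed. Tuples are usually written with parentheses (), with items separated by commas; a single-item tuple needs a trailing comma, as in ("orange",).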

Creating a Tuple

my_tuple = (1, 2, 3)
fruits_tuple = ("apple", "banana", "cherry")
mixed_tuple = (1, "banana", 3.14, True)

Modifying Tuples

  • Adding Values: You cannot directly add items to a tuple, but you can concatenate tuples to form a new one.

    new_fruits_tuple = fruits_tuple + ("orange",)
  • Removing Values: Tuples do not support item removal, but you can create a new tuple without certain elements.

    updated_tuple = fruits_tuple[:1] + fruits_tuple[2:]  # Removes "banana"

Example:

fruits_tuple = ("apple", "banana", "cherry")
new_fruits_tuple = fruits_tuple + ("orange",)
updated_tuple = fruits_tuple[:1] + fruits_tuple[2:]
print(new_fruits_tuple) # Output: ('apple', 'banana', 'cherry', 'orange')
print(updated_tuple) # Output: ('apple', 'cherry')
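
To make the immutability concrete, here is a small sketch of what happens when code tries to assign to a tuple item, along with the trailing-comma rule for one-item tuples. The names `point`, `single`, and `not_a_tuple` are illustrative and do not come from the example above.

```python
point = (1, 2, 3)  # hypothetical tuple for illustration

# Item assignment is not supported on tuples and raises TypeError
try:
    point[0] = 10
    error_message = ""
except TypeError as exc:
    error_message = str(exc)
print(point)               # (1, 2, 3), unchanged

single = ("apple",)        # The trailing comma makes a one-item tuple
not_a_tuple = ("apple")    # Without the comma this is just a string
print(type(single).__name__, type(not_a_tuple).__name__)  # tuple str
```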

3. Dictionaries

A dictionary is a mutable collection of key-value pairs. Since Python 3.7, dictionaries preserve the order in which keys were inserted. They are defined using curly braces {} and are ideal for storing data that needs to be associated with unique keys.

Creating a Dictionary

my_dict = {"name": "Alice", "age": 30, "city": "New York"}

Adding, Removing, and Modifying Values

  • Adding Key-Value Pairs:

    my_dict["country"] = "USA"  # Adds a new key-value pair
  • Removing Key-Value Pairs:

    del my_dict["city"]  # Removes the key "city"
    age = my_dict.pop("age") # Removes "age" and returns its value
  • Modifying Values:

    my_dict["name"] = "Bob"  # Updates the value for the key "name"

Example:

my_dict = {"name": "Alice", "age": 30, "city": "New York"}
my_dict["country"] = "USA"
del my_dict["city"]
age = my_dict.pop("age")
my_dict["name"] = "Bob"
print(my_dict) # Output: {'name': 'Bob', 'country': 'USA'}
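
Two details are worth seeing in action: on Python 3.7 and later, dictionaries keep keys in insertion order, and both get() and pop() accept a default so that a missing key does not raise KeyError. The sketch below uses a hypothetical `profile` dictionary, and update() is shown as one possible way to add and modify several keys in a single call.

```python
profile = {"name": "Alice", "age": 30}   # hypothetical data
profile["city"] = "New York"

# Keys come back in the order they were inserted (Python 3.7+)
print(list(profile))                     # ['name', 'age', 'city']

# A default value avoids KeyError when the key is missing
country = profile.get("country", "unknown")
removed = profile.pop("zip", None)
print(country, removed)                  # unknown None

# update() changes existing keys and adds new ones
profile.update({"age": 31, "country": "USA"})
print(profile)
```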

4. Sets

A set is an unordered, mutable collection of unique items. Sets are useful for storing items that should not have duplicates. They are defined with curly braces {} or the set() function; an empty set must be created with set(), because {} creates an empty dictionary.

Creating a Set

my_set = {1, 2, 3, 4, 5}
fruits_set = {"apple", "banana", "cherry"}

Adding, Removing, and Modifying Values

  • Adding Values:

    fruits_set.add("orange")  # Adds "orange" to the set
  • Removing Values:

    fruits_set.remove("banana")  # Removes "banana", raises KeyError if not found
    fruits_set.discard("kiwi") # Removes "kiwi", no error if not present
  • Modifying Values: You cannot modify an item in a set in place; to replace one, remove it and add the new value. To empty the set entirely, use clear().

    fruits_set.clear()  # Removes all items from the set

Example:

fruits_set = {"apple", "banana", "cherry"}
fruits_set.add("orange")
fruits_set.remove("banana")
fruits_set.discard("kiwi") # No error if "kiwi" is not present
fruits_set.clear()
print(fruits_set) # Output: set()
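
The example above ends with an empty set, so it does not show uniqueness or the remove-then-add pattern. The following sketch, with illustrative names (`numbers`, `basket`) that are not from the article, covers duplicate removal, the KeyError raised by remove(), and replacing one item with another.

```python
numbers = set([1, 2, 2, 3, 3, 3])   # Duplicates are dropped
print(len(numbers))                 # 3

print(type({}).__name__)            # dict: {} is an empty dictionary, not a set
print(type(set()).__name__)         # set

basket = {"apple", "banana", "cherry"}  # hypothetical set

# remove() raises KeyError for a missing item; discard() would not
try:
    basket.remove("kiwi")
    raised = False
except KeyError:
    raised = True

# "Modify" an item by removing it and adding its replacement
basket.remove("banana")
basket.add("mango")
print(sorted(basket))               # ['apple', 'cherry', 'mango']
```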

5. DataFrames (pandas)

A DataFrame is a two-dimensional, size-mutable tabular data structure provided by the pandas library. It is a widely used structure for handling and analyzing data in Python, similar to spreadsheets or SQL tables.

Creating a DataFrame

import pandas as pd

data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [30, 25, 35],
    "City": ["New York", "Los Angeles", "Chicago"]
}
df = pd.DataFrame(data)

Adding, Removing, and Modifying Values

  • Adding Values:

    df["Country"] = ["USA", "USA", "USA"]  # Adds a new column
    df.loc[3] = ["David", 40, "Miami", "USA"] # Adds a new row (one value per column, including "Country")
  • Removing Values:

    df.drop("City", axis=1, inplace=True)  # Removes the "City" column
    df.drop(0, inplace=True) # Removes the row with index label 0 (the first row here)
  • Modifying Values:

    df.at[1, "Age"] = 26  # Updates Bob's age

Example:

import pandas as pd

data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [30, 25, 35],
    "City": ["New York", "Los Angeles", "Chicago"]
}
df = pd.DataFrame(data)

df["Country"] = ["USA", "USA", "USA"]
df.loc[3] = ["David", 40, "Miami", "USA"]  # The row needs a value for every column, including "Country"
df.drop("City", axis=1, inplace=True)
df.drop(0, inplace=True)
df.at[1, "Age"] = 26
print(df)
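
As an alternative to inplace=True, drop() returns a new DataFrame by default, so the result can simply be assigned to a name. The sketch below assumes pandas is installed and uses a smaller, hypothetical frame. It also shows that dropping a row leaves the remaining index labels unchanged unless reset_index() is called.

```python
import pandas as pd

# Hypothetical two-column frame for illustration
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [30, 25, 35],
})

# drop() returns a new DataFrame; the original df is left untouched
without_age = df.drop(columns="Age")
print(list(without_age.columns))     # ['Name']

# Dropping index label 0 keeps labels 1 and 2 as they were
trimmed = df.drop(index=0)
print(list(trimmed.index))           # [1, 2]

# reset_index(drop=True) renumbers the rows starting from 0
renumbered = trimmed.reset_index(drop=True)
print(renumbered.loc[0, "Name"])     # Bob
```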

Conclusion

Understanding Python's core data structures (lists, tuples, dictionaries, and sets) together with pandas DataFrames is crucial for effective data manipulation and analysis. Each has its own strengths: lists for ordered, changeable sequences; tuples for fixed records; dictionaries for lookups by key; sets for unique items; and DataFrames for tabular data. Mastering them gives you a solid foundation for more advanced programming tasks, especially in data science.