Python - Data Types Overview | Application Architect

Key Insights

Python’s dynamic typing provides flexibility, but understanding the underlying data types is essential for writing performant, bug-free code—especially when dealing with mutability, hashability, and type coercion.
Choosing the right data structure (list vs tuple, dict vs set) has real performance implications: membership testing is O(n) on a list but O(1) on average for a set.
Python’s truthy/falsy evaluation and None handling are common sources of subtle bugs; always use is for None comparisons and be explicit about what constitutes “empty” in your domain.

Introduction to Python’s Type System

Python is dynamically typed, meaning you don’t declare variable types explicitly—the interpreter figures it out at runtime. This doesn’t mean Python is weakly typed; it’s actually strongly typed. You can’t add a string to an integer without explicit conversion.

x = 42        # x is an int
x = "hello"   # now x is a str—no error, but the type changed

# This will raise TypeError, not silently coerce
result = "age: " + 42  # TypeError: can only concatenate str (not "int") to str
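
To make the concatenation work, the conversion has to be spelled out. A minimal sketch, with illustrative variable names that are not part of the original example:

```python
age = 42

# Convert explicitly before concatenating
label = "age: " + str(age)

# Or let an f-string handle the formatting
label_f = f"age: {age}"

# Going the other way: parse text into a number before doing arithmetic
next_year = int("42") + 1

# Mixing the two types directly still fails
try:
    "age: " + age
except TypeError as exc:
    error_message = str(exc)

print(label)          # age: 42
print(next_year)      # 43
print(error_message)
```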

Type hints, introduced in Python 3.5, have become increasingly important for documentation and static analysis. The list[float] syntax below requires Python 3.9 or later:

def calculate_total(prices: list[float], tax_rate: float) -> float:
    return sum(prices) * (1 + tax_rate)

Type hints don’t enforce anything at runtime—they’re for tooling like mypy and for human readers. Understanding Python’s built-in data types matters because it affects memory usage, performance characteristics, and what operations are legal on your data.
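
A small sketch of what "not enforced at runtime" means in practice. It reuses calculate_total from above and passes integers where floats are hinted; the variable names total and hints are illustrative:

```python
def calculate_total(prices: list[float], tax_rate: float) -> float:
    return sum(prices) * (1 + tax_rate)

# Integers where floats were hinted: the call runs without complaint
total = calculate_total([10, 20], 0)
print(total)  # 30

# The hints are stored as plain metadata on the function object
hints = calculate_total.__annotations__
print(hints["tax_rate"])  # <class 'float'>
```

A static checker such as mypy is what reports a mismatch, for example a str passed as tax_rate; the interpreter itself never consults the annotations.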

Numeric Types: int, float, complex

Python has three numeric types, and they behave differently than you might expect coming from other languages.

Integers in Python have arbitrary precision, limited only by available memory. There’s no overflow:

big_number = 10 ** 100  # This works fine
print(big_number)       # A googol, all 101 digits

# Useful integer operations
print(17 // 5)   # 3 (floor division)
print(17 % 5)    # 2 (modulo)
print(divmod(17, 5))  # (3, 2) (both at once)

Floats are IEEE 754 double-precision, which means they have the usual precision issues:

# Classic floating-point gotcha
print(0.1 + 0.2)  # 0.30000000000000004

# For financial calculations, use Decimal
from decimal import Decimal
price = Decimal("19.99")
tax = Decimal("0.08")
total = price * (1 + tax)  # Exact arithmetic
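
Two follow-ups that often come up with these examples, sketched under the assumption that totals should be rounded half-up to whole cents (the rounding mode is a choice, not something the original specifies):

```python
import math
from decimal import Decimal, ROUND_HALF_UP

# Compare floats with a tolerance instead of ==
floats_equal = (0.1 + 0.2 == 0.3)
floats_close = math.isclose(0.1 + 0.2, 0.3)
print(floats_equal)  # False
print(floats_close)  # True

# Decimal keeps every digit of the product, so round to cents explicitly
price = Decimal("19.99")
tax = Decimal("0.08")
total = price * (1 + tax)
rounded = total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
print(total)    # 21.5892
print(rounded)  # 21.59
```

Construct Decimal values from strings, as above. Decimal(0.1) would carry the float's binary rounding error into the decimal value.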

Complex numbers are built-in, which is unusual for a general-purpose language:

z = 3 + 4j
print(z.real)      # 3.0
print(z.imag)      # 4.0
print(abs(z))      # 5.0 (magnitude)
print(z.conjugate())  # (3-4j)

Type coercion happens automatically in mixed arithmetic—int promotes to float, float promotes to complex:

result = 5 + 2.5    # int + float = float (7.5)
result = 2.5 + 3j   # float + complex = complex (2.5+3j)
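
The promotion rules can be checked directly with type(). The sketch below also shows that true division always produces a float, and that narrowing back to int has to be requested explicitly:

```python
a = 5 + 2.5     # int + float
b = 2.5 + 3j    # float + complex
c = 6 / 2       # true division returns a float even for whole results
d = 7 // 2      # floor division of two ints stays an int

print(type(a).__name__, a)  # float 7.5
print(type(b).__name__, b)  # complex (2.5+3j)
print(type(c).__name__, c)  # float 3.0
print(type(d).__name__, d)  # int 3

# Narrowing is never automatic; int() truncates toward zero
e = int(7.9)
f = int(-7.9)
print(e, f)  # 7 -7
```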

Text Type: str

Strings in Python 3 are Unicode by default and immutable. Every string operation that “modifies” a string actually creates a new one.

# String creation
single = 'hello'
double = "hello"
multiline = """This spans
multiple lines"""
raw = r"C:\Users\name"  # raw string, backslashes are literal

# Immutability means this creates a new string
name = "alice"
name_upper = name.upper()  # "ALICE" (new string)
print(name)  # still "alice"
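
Immutability also means item assignment fails outright. A brief sketch (variable names are illustrative) of the error and of two common ways to build a changed string:

```python
name = "alice"

# Item assignment is not allowed on a str
try:
    name[0] = "A"
except TypeError as exc:
    print(f"TypeError: {exc}")

# Build a modified copy instead
capitalized = "A" + name[1:]
print(capitalized)  # Alice

# When assembling many pieces, collect them in a list and join once
parts = [str(n) for n in range(5)]
joined = ",".join(parts)
print(joined)  # 0,1,2,3,4
```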

F-strings (Python 3.6+) are the preferred way to format strings:

user = "alice"
balance = 1234.5678

# Basic interpolation
print(f"User: {user}, Balance: ${balance:.2f}")

# Expressions inside f-strings
items = ["a", "b", "c"]
print(f"Count: {len(items)}")

# Debug formatting (Python 3.8+)
x = 42
print(f"{x=}")  # prints "x=42"

Common string methods you’ll use constantly:

text = "  hello, world  "

# Cleaning
print(text.strip())        # "hello, world"
print(text.lstrip())       # "hello, world  "

# Splitting and joining
words = "one,two,three".split(",")  # ["one", "two", "three"]
rejoined = "-".join(words)          # "one-two-three"

# Searching
print("world" in text)              # True
print(text.find("world"))           # 9 (index)
print(text.replace("world", "python"))  # "  hello, python  "

# Slicing
s = "python"
print(s[0:3])    # "pyt"
print(s[-2:])    # "on"
print(s[::-1])   # "nohtyp" (reversed)

Sequence Types: list, tuple, range

Lists are mutable, ordered sequences. They’re your workhorse collection:

# Creation and modification
numbers = [1, 2, 3]
numbers.append(4)        # [1, 2, 3, 4]
numbers.extend([5, 6])   # [1, 2, 3, 4, 5, 6]
numbers.insert(0, 0)     # [0, 1, 2, 3, 4, 5, 6]
popped = numbers.pop()   # 6, list is now [0, 1, 2, 3, 4, 5]

# List comprehensions—learn these, use them
squares = [x**2 for x in range(10)]
evens = [x for x in range(20) if x % 2 == 0]
matrix = [[i*j for j in range(3)] for i in range(3)]
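
Because lists are mutable, assigning a list to a second name does not copy it. The sketch below (variable names are illustrative, not from the original) contrasts an alias, a shallow copy, and a deep copy:

```python
import copy

original = [[1, 2], [3, 4]]
alias = original                # same list object under a second name
shallow = original.copy()       # new outer list, shared inner lists
deep = copy.deepcopy(original)  # fully independent copy

original[0].append(99)   # mutates an inner list
original.append([5, 6])  # mutates the outer list

print(alias)    # [[1, 2, 99], [3, 4], [5, 6]]
print(shallow)  # [[1, 2, 99], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```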

Tuples are immutable sequences. Use them for fixed collections and, as long as every element is hashable, as dictionary keys:

# Creation (parentheses optional but recommended)
point = (3, 4)
single = (42,)  # Note the comma—without it, it's just an int in parentheses

# Tuple unpacking is powerful
x, y = point
first, *rest = [1, 2, 3, 4]  # first=1, rest=[2, 3, 4]
a, *middle, z = [1, 2, 3, 4, 5]  # a=1, middle=[2, 3, 4], z=5

# Named tuples for readable code
from collections import namedtuple
Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)
print(p.x, p.y)  # 3 4
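
Named tuples keep the immutability of ordinary tuples. A short sketch, reusing the Point type from above, of what that looks like in use; the labels dictionary is an illustrative addition:

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)

# Fields cannot be reassigned
try:
    p.x = 10
except AttributeError as exc:
    print(f"AttributeError: {exc}")

# _replace returns a new tuple with the change applied
moved = p._replace(x=10)
print(moved)  # Point(x=10, y=4)
print(p)      # Point(x=3, y=4)

# Still a regular tuple: unpacking and use as a dict key both work
x, y = moved
labels = {p: "start", moved: "end"}
print(labels[Point(3, 4)])  # start
```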

Range is a lazy sequence—it doesn’t store all values in memory:

# Memory efficient iteration
for i in range(1_000_000):  # doesn't create a million-element list
    if i > 5:
        break
    print(i)

# Range supports slicing and membership testing
r = range(0, 100, 2)  # even numbers 0-98
print(50 in r)        # True (O(1) check, not iteration)
print(list(r[10:15])) # [20, 22, 24, 26, 28]
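
The memory difference can be observed with sys.getsizeof. Exact byte counts vary by Python version and platform, so this sketch only compares the two sizes rather than quoting numbers:

```python
import sys

lazy = range(1_000_000)
eager = list(lazy)

range_size = sys.getsizeof(lazy)  # small, independent of the length
list_size = sys.getsizeof(eager)  # grows with the number of elements
print(range_size < list_size)     # True

# Length and indexing are computed arithmetically, without iterating
print(len(lazy))  # 1000000
print(lazy[-1])   # 999999
```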

Mapping Type: dict

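Dictionaries are hash tables mapping keys to values. Keys must be hashable: built-in immutable types such as str, int, and tuples of hashable items qualify, while list, dict, and set do not.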

# Creation
user = {"name": "alice", "age": 30}
user = dict(name="alice", age=30)  # equivalent

# Access with defaults
print(user.get("email", "not set"))  # "not set" (no KeyError)

# Dictionary comprehensions
squares = {x: x**2 for x in range(5)}  # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

# Merging dictionaries (Python 3.9+)
defaults = {"theme": "dark", "language": "en"}
overrides = {"language": "es"}
config = defaults | overrides  # {"theme": "dark", "language": "es"}

# Iteration patterns
for key in user:
    print(key)
for key, value in user.items():
    print(f"{key}: {value}")
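
The key requirement is easiest to see by trying a few candidates. In this sketch the Session class and all variable names are hypothetical, introduced only for illustration:

```python
# Hashable keys: str, int, tuples of hashables, frozenset
lookup = {
    "name": 1,
    42: 2,
    (1, 2): 3,
    frozenset({"a", "b"}): 4,
}

# Built-in mutable containers are rejected as keys
unhashable = []
for candidate in ([1, 2], {"a": 1}, {1, 2}):
    try:
        hash(candidate)
    except TypeError:
        unhashable.append(type(candidate).__name__)
print(unhashable)  # ['list', 'dict', 'set']

# Instances of user-defined classes are hashable by default (by identity),
# even though they are mutable
class Session:
    pass

session = Session()
registry = {session: "active"}
print(registry[session])  # active
```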

Common patterns with dictionaries:

# Counting with defaultdict
from collections import defaultdict
counter = defaultdict(int)
for word in ["a", "b", "a", "c", "a"]:
    counter[word] += 1  # no KeyError on first access

# Grouping
groups = defaultdict(list)
for item in [("a", 1), ("b", 2), ("a", 3)]:
    groups[item[0]].append(item[1])
# {"a": [1, 3], "b": [2]}

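# setdefault for single operations
# (expensive_computation is a placeholder; note that it runs on every call,
# even when "key" is already present, because arguments are evaluated first)
cache = {}
value = cache.setdefault("key", expensive_computation())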
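
One caveat with setdefault is that its default argument is evaluated before the call, whether or not the key exists. The sketch below uses a hypothetical stand-in for expensive_computation that counts its own invocations, plus an illustrative get_cached helper that computes only on a miss:

```python
calls = 0

def expensive_computation():
    # Hypothetical stand-in that records how often it runs
    global calls
    calls += 1
    return 42

cache = {}
cache.setdefault("key", expensive_computation())  # computes and stores 42
cache.setdefault("key", expensive_computation())  # computes again, result discarded
print(calls)  # 2

def get_cached(key):
    # Check first so the computation runs only when the key is missing
    if key not in cache:
        cache[key] = expensive_computation()
    return cache[key]

get_cached("key")    # hit: no computation
get_cached("other")  # miss: one computation
print(calls)  # 3
```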

Set Types: set, frozenset

Sets are unordered collections of unique, hashable elements. They’re optimized for membership testing and mathematical set operations.

# Creation and deduplication
numbers = {1, 2, 3, 2, 1}  # {1, 2, 3}
unique = set([1, 2, 2, 3, 3, 3])  # {1, 2, 3}

# Fast membership testing (average O(1) vs O(n) for lists)
valid_statuses = {"pending", "active", "completed"}
status = "active"  # example value
if status in valid_statuses:  # fast lookup
    print(f"processing {status}")

# Set operations
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}
print(a | b)  # union: {1, 2, 3, 4, 5, 6}
print(a & b)  # intersection: {3, 4}
print(a - b)  # difference: {1, 2}
print(a ^ b)  # symmetric difference: {1, 2, 5, 6}

# Practical: find common elements
users_a = {"alice", "bob", "charlie"}
users_b = {"bob", "diana", "charlie"}
common = users_a & users_b  # {"bob", "charlie"}
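
The O(1) versus O(n) difference can be measured with timeit. This is a rough sketch: the collection size and repeat count are arbitrary, and absolute timings depend on the machine, so no specific numbers are claimed:

```python
import timeit

items_list = list(range(100_000))
items_set = set(items_list)
missing = -1  # worst case for the list: every element gets compared

list_time = timeit.timeit(lambda: missing in items_list, number=100)
set_time = timeit.timeit(lambda: missing in items_set, number=100)

print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```

On a typical machine the set lookup is several orders of magnitude faster at this size. For a handful of elements the difference is negligible.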

Frozensets are immutable sets—use them when you need a set as a dictionary key:

# Frozenset as dictionary key
permissions = {
    frozenset({"read"}): "viewer",
    frozenset({"read", "write"}): "editor",
    frozenset({"read", "write", "admin"}): "admin"
}
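
A lookup against such a table might look like the following sketch. The role_for helper and the "unknown" fallback are hypothetical additions, not part of the original:

```python
permissions = {
    frozenset({"read"}): "viewer",
    frozenset({"read", "write"}): "editor",
    frozenset({"read", "write", "admin"}): "admin",
}

def role_for(granted):
    # Hypothetical helper: order and duplicates in the input do not matter
    return permissions.get(frozenset(granted), "unknown")

print(role_for(["write", "read"]))  # editor
print(role_for(["read", "read"]))   # viewer
print(role_for(["delete"]))         # unknown
```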

Boolean and None Types

Python’s boolean type has exactly two values: True and False. But many values are “truthy” or “falsy”:

# Falsy values
bool(0)        # False
bool(0.0)      # False
bool("")       # False
bool([])       # False
bool({})       # False
bool(None)     # False

# Most other values are truthy (other empty containers such as (), set(), and range(0) are falsy too)
bool(1)        # True
bool("hello")  # True
bool([0])      # True (non-empty list, even if contents are falsy)
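
Truthiness checks become a problem when a falsy value such as 0 is legitimate data. The sketch below uses hypothetical functions and a hypothetical Basket class to show the pitfall, and how a custom class opts into truthiness through __len__:

```python
def describe_discount(discount=None):
    # 0 is a valid discount; None means "not provided"
    if discount is None:
        return "no discount configured"
    return f"discount: {discount}%"

def describe_discount_buggy(discount=None):
    # A truthiness test treats 0 the same as None
    if not discount:
        return "no discount configured"
    return f"discount: {discount}%"

print(describe_discount(0))        # discount: 0%
print(describe_discount_buggy(0))  # no discount configured

class Basket:
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        # With no __bool__ defined, bool() falls back to len() != 0
        return len(self.items)

print(bool(Basket([])))     # False
print(bool(Basket(["x"])))  # True
```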

None is Python’s null value. Always check for it with is, not ==:

database = {}  # placeholder for a real data store

def find_user(user_id):
    # Returns a user record or None
    return database.get(user_id)

user = find_user(123)

# Wrong: uses __eq__, which a class can override
if user == None:
    pass

# Right: identity check
if user is None:
    print("user not found")

# Common pattern: default arguments
def greet(name=None):
    if name is None:
        name = "stranger"
    print(f"Hello, {name}")
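
To see why == is the weaker check, consider a class whose __eq__ is too permissive. AlwaysEqual is a deliberately contrived, hypothetical example:

```python
class AlwaysEqual:
    # Contrived: claims equality with everything, including None
    def __eq__(self, other):
        return True

obj = AlwaysEqual()

eq_result = (obj == None)  # calls AlwaysEqual.__eq__
is_result = (obj is None)  # compares identity, cannot be overridden

print(eq_result)  # True
print(is_result)  # False
```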

Watch out for mutable default arguments:

# Bug: default list is shared across calls
def append_to(item, target=[]):
    target.append(item)
    return target

# Fix: use None as sentinel
def append_to(item, target=None):
    if target is None:
        target = []
    target.append(item)
    return target
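
Running both versions side by side makes the difference visible. The two functions are renamed here (append_shared and append_fresh, names chosen for this sketch) so they can coexist in one file:

```python
def append_shared(item, target=[]):
    # The default list is created once, when the function is defined
    target.append(item)
    return target

def append_fresh(item, target=None):
    # A new list is created on each call that omits target
    if target is None:
        target = []
    target.append(item)
    return target

print(append_shared(1))  # [1]
print(append_shared(2))  # [1, 2]
print(append_fresh(1))   # [1]
print(append_fresh(2))   # [2]

# The accumulated state lives on the function object itself
print(append_shared.__defaults__)  # ([1, 2],)
```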

Understanding these data types deeply will help you write cleaner, faster Python code. The right choice of data structure often makes the difference between elegant solutions and convoluted workarounds.