Python - Convert String to Int/Float

Key Insights

• Python provides int() and float() built-in functions for type conversion, but they raise ValueError for invalid inputs requiring proper exception handling
• Locale-aware parsing with locale.atof() handles regional number formats, while ast.literal_eval() provides safe evaluation of string literals
• Production code needs validation strategies including regex patterns, try-except blocks, and default values to handle edge cases like None, empty strings, and non-numeric data

Basic Conversion with int() and float()

The int() and float() functions are the standard way to convert strings to numeric types in Python. These functions parse string representations and return the corresponding numeric value.

# Basic string to int conversion
age_str = "25"
age = int(age_str)
print(age)  # 25
print(type(age))  # <class 'int'>

# Basic string to float conversion
price_str = "19.99"
price = float(price_str)
print(price)  # 19.99
print(type(price))  # <class 'float'>

# int() with different bases
binary_str = "1010"
decimal_from_binary = int(binary_str, 2)
print(decimal_from_binary)  # 10

hex_str = "FF"
decimal_from_hex = int(hex_str, 16)
print(decimal_from_hex)  # 255

The int() function accepts an optional second argument specifying the base: 2 through 36, or 0 to infer the base from a prefix such as 0x or 0b. Without it, base 10 is assumed. The float() function parses decimal notation only.
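A few edge cases of the built-in parsers are easy to miss. The sketch below only exercises standard int() and float() behavior, plus float.fromhex() for hexadecimal floats; the sample strings are illustrative.

```python
# Base 0 tells int() to infer the base from a literal-style prefix.
print(int("0x1f", 0))    # 31
print(int("0b1010", 0))  # 10
print(int("0o17", 0))    # 15

# Surrounding whitespace and underscore digit separators are accepted.
print(int("  42\n"))     # 42
print(int("1_000_000"))  # 1000000

# float() understands scientific notation and special values.
print(float("1e3"))      # 1000.0
print(float("inf"))      # inf

# Hexadecimal floats go through float.fromhex(), not float().
print(float.fromhex("0x1.8p1"))  # 3.0
```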

Handling Invalid Input with Exception Handling

String-to-number conversion fails when the input doesn’t represent a valid number. Production code must handle these cases explicitly.

def safe_int_conversion(value, default=0):
    """Convert string to int with fallback default."""
    try:
        return int(value)
    except (ValueError, TypeError):
        return default

def safe_float_conversion(value, default=0.0):
    """Convert string to float with fallback default."""
    try:
        return float(value)
    except (ValueError, TypeError):
        return default

# Usage examples
print(safe_int_conversion("42"))  # 42
print(safe_int_conversion("invalid"))  # 0
print(safe_int_conversion(None))  # 0
print(safe_int_conversion("abc", -1))  # -1

print(safe_float_conversion("3.14"))  # 3.14
print(safe_float_conversion("not a number"))  # 0.0
print(safe_float_conversion("", 1.0))  # 1.0

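This pattern catches both ValueError (invalid format, including empty strings) and TypeError (None or other unsupported types such as lists), which covers most malformed real-world input. The trade-off is that a returned default is indistinguishable from a genuine value, so pick a default that cannot occur in valid data when the difference matters.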
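When a fallback of 0 could be confused with real data, one option is to return None instead and to reject the special values that float() accepts. The following is a minimal sketch; try_finite_float is a hypothetical helper name, not part of the examples above.

```python
import math
from typing import Optional


def try_finite_float(value) -> Optional[float]:
    """Return a finite float, or None if value cannot be parsed as one.

    Hypothetical helper for illustration only.
    """
    try:
        result = float(value)
    except (ValueError, TypeError, OverflowError):
        return None
    # float() happily parses "nan", "inf" and "-infinity"; reject them here.
    return result if math.isfinite(result) else None


print(try_finite_float("0"))     # 0.0 (a real zero, not a fallback)
print(try_finite_float("oops"))  # None
print(try_finite_float("nan"))   # None
print(try_finite_float(None))    # None
```

Callers can then test the result with `is None`, which keeps a parsed 0.0 distinct from a failed parse.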

Stripping Whitespace and Special Characters

Strings from user input or file parsing often contain whitespace or currency symbols. Clean the input before conversion.

def parse_number(value, num_type='int'):
    """Parse number from string with cleaning."""
    if value is None:
        return None
    
    # Remove whitespace
    cleaned = str(value).strip()
    
    # Remove common non-numeric characters
    cleaned = cleaned.replace(',', '')  # Thousand separators
    cleaned = cleaned.replace('$', '')  # Currency symbols
    cleaned = cleaned.replace('%', '')  # Percentage signs (value is NOT divided by 100)
    
    try:
        if num_type == 'int':
            return int(float(cleaned))  # Via float so "42.0" parses; note "42.7" truncates to 42
        else:
            return float(cleaned)
    except (ValueError, TypeError, OverflowError):  # OverflowError: int(float("inf"))
        return None

# Test cases
print(parse_number("  42  "))  # 42
print(parse_number("$1,234.56", 'float'))  # 1234.56
print(parse_number("1,000", 'int'))  # 1000
print(parse_number("42.0", 'int'))  # 42

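Note the two-step conversion for integers: int(float(cleaned)). This handles strings like “42.0” that int() would reject but that represent valid integers. It has two side effects: fractional input such as “42.7” is silently truncated to 42, and integers above 2**53 can lose precision because they pass through a float.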
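If truncation or precision loss is a concern, the standard decimal module offers an alternative to the float round trip. The sketch below assumes the input has already been cleaned of separators and symbols; parse_exact_int is a hypothetical name introduced here.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_exact_int(text: str) -> Optional[int]:
    """Parse text as an integer without going through float.

    Hypothetical helper: accepts "42" and "42.0", returns None for "42.7",
    "nan", "inf" or anything that is not a number.
    """
    try:
        number = Decimal(text.strip())
    except (InvalidOperation, AttributeError):
        return None
    if not number.is_finite():
        return None
    # Only accept values with no fractional part.
    if number != number.to_integral_value():
        return None
    return int(number)


print(parse_exact_int("42.0"))              # 42
print(parse_exact_int("42.7"))              # None
print(parse_exact_int("9007199254740993"))  # 9007199254740993
print(int(float("9007199254740993")))       # 9007199254740992 (precision lost)
```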

Locale-Aware Number Parsing

Different regions use different decimal separators. Many European locales, such as German and French, use a comma for decimals (3,14), while US and UK locales use a period (3.14).

import locale

def parse_localized_float(value, locale_name='en_US.UTF-8'):
    """Parse float respecting locale formatting."""
    original_locale = locale.getlocale(locale.LC_NUMERIC)
    try:
        locale.setlocale(locale.LC_NUMERIC, locale_name)
        return locale.atof(value)
    except (ValueError, locale.Error):
        return None
    finally:
        locale.setlocale(locale.LC_NUMERIC, original_locale)

# Parse European format (comma as decimal separator)
european_number = "1.234,56"
result = parse_localized_float(european_number, 'de_DE.UTF-8')
print(result)  # 1234.56

# Parse US format
us_number = "1,234.56"
result = parse_localized_float(us_number, 'en_US.UTF-8')
print(result)  # 1234.56

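Always restore the original locale in the finally block to avoid affecting other parts of your application. Keep in mind that setlocale() changes process-wide state and is not thread-safe, and that the requested locale must be installed on the system; if it is not, this function returns None.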
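Where changing the process locale is undesirable, a simpler technique is to pass the separators explicitly. This is not the locale module's approach but a plain string substitution, sketched under the assumption that the caller knows the format in advance. The name parse_with_separators is hypothetical, and the helper does not check that thousands groups are well formed.

```python
from typing import Optional


def parse_with_separators(text: str, decimal: str = ".", thousands: str = ",") -> Optional[float]:
    """Parse a float given explicit separator characters (hypothetical helper)."""
    # Drop grouping characters first, then normalise the decimal mark to ".".
    cleaned = text.strip().replace(thousands, "").replace(decimal, ".")
    try:
        return float(cleaned)
    except ValueError:
        return None


print(parse_with_separators("1.234,56", decimal=",", thousands="."))  # 1234.56
print(parse_with_separators("1,234.56"))                              # 1234.56
print(parse_with_separators("abc"))                                   # None
```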

Validation Before Conversion

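Validate string format before attempting conversion when you need specific error messages or want to enforce a stricter format than int() and float() accept. A try-except block costs little when no exception is raised, so validating first mainly helps performance when invalid input is common.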

import re

def is_valid_int(value):
    """Check if string represents a valid integer."""
    if not isinstance(value, str):
        return False
    pattern = r'^[+-]?\d+$'
    return bool(re.match(pattern, value.strip()))

def is_valid_float(value):
    """Check if string represents a valid float."""
    if not isinstance(value, str):
        return False
    pattern = r'^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?$'
    return bool(re.match(pattern, value.strip()))

# Validation examples
print(is_valid_int("42"))  # True
print(is_valid_int("-123"))  # True
print(is_valid_int("12.34"))  # False
print(is_valid_int("abc"))  # False

print(is_valid_float("3.14"))  # True
print(is_valid_float("-2.5e10"))  # True
print(is_valid_float(".5"))  # True
print(is_valid_float("1.2.3"))  # False

The float regex handles scientific notation and leading/trailing decimal points. It is stricter than Python’s float(), which also accepts “inf”, “nan”, and underscore separators such as “1_000.5”.
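To see where the pattern and float() disagree, the two can be run side by side. The sketch reuses the float pattern from above with re.fullmatch; the helper names and the sample strings are illustrative.

```python
import re

FLOAT_PATTERN = re.compile(r'[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?')


def regex_accepts(text: str) -> bool:
    """True if the whole stripped string matches the float pattern."""
    return FLOAT_PATTERN.fullmatch(text.strip()) is not None


def float_accepts(text: str) -> bool:
    """True if the built-in float() can parse the string."""
    try:
        float(text)
        return True
    except ValueError:
        return False


# "inf", "nan" and "1_000.5" are accepted by float() but not by the regex.
for sample in ["3.14", "1e5", "inf", "nan", "1_000.5", "1.2.3"]:
    print(f"{sample!r}: regex={regex_accepts(sample)}, float()={float_accepts(sample)}")
```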

Converting Lists and Bulk Data

When processing CSV files or API responses, convert multiple strings efficiently.

def convert_string_list(string_list, converter=int, skip_invalid=True):
    """Convert list of strings to numbers."""
    results = []
    errors = []
    
    for i, value in enumerate(string_list):
        try:
            results.append(converter(value))
        except (ValueError, TypeError) as e:
            errors.append((i, value, str(e)))
            if not skip_invalid:
                raise
    
    return results, errors

# Example usage
data = ["42", "3.14", "100", "invalid", "256"]

# Convert to integers, skipping invalid
integers, int_errors = convert_string_list(data, int, skip_invalid=True)
print(integers)  # [42, 100, 256]
print(int_errors)  # [(1, '3.14', ...), (3, 'invalid', ...)]

# Convert to floats
floats, float_errors = convert_string_list(data, float, skip_invalid=True)
print(floats)  # [42.0, 3.14, 100.0, 256.0]
print(float_errors)  # [(3, 'invalid', ...)]

# List comprehension alternative for simple cases (isdigit() also skips negatives like "-5")
valid_ints = [int(x) for x in data if x.isdigit()]
print(valid_ints)  # [42, 100, 256]

This approach tracks both successful conversions and errors, essential for data quality monitoring.
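The same idea carries over to CSV input. The sketch below uses the standard csv module with an in-memory sample standing in for a real file; the column names and values are made up for illustration.

```python
import csv
import io

# Inline sample standing in for a real file; the column names are made up.
raw = io.StringIO("sku,quantity\nA1,5\nB2,ten\nC3,12\n")

quantities = []
bad_rows = []
# start=2 because line 1 of the file is the header row.
for line_no, row in enumerate(csv.DictReader(raw), start=2):
    try:
        quantities.append(int(row["quantity"]))
    except ValueError as exc:
        bad_rows.append((line_no, row["quantity"], str(exc)))

print(quantities)  # [5, 12]
print(bad_rows)    # [(3, 'ten', ...)]
```

Recording the file line number alongside the raw value makes it easier to trace a bad record back to its source.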

Using ast.literal_eval for Safe Evaluation

The ast.literal_eval() function safely evaluates string literals, useful when strings might contain Python literal representations.

import ast

def safe_eval_number(value):
    """Safely evaluate string to number using AST."""
    try:
        result = ast.literal_eval(value)
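        # bool is a subclass of int, so exclude True/False explicitly
        if isinstance(result, (int, float)) and not isinstance(result, bool):
            return result
        return None
    except (ValueError, SyntaxError, TypeError, MemoryError, RecursionError):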
        return None

# Works with various formats
print(safe_eval_number("42"))  # 42
print(safe_eval_number("3.14"))  # 3.14
print(safe_eval_number("-100"))  # -100
print(safe_eval_number("0xFF"))  # 255 (hex literal)
print(safe_eval_number("0b1010"))  # 10 (binary literal)

# Safely rejects code execution
print(safe_eval_number("__import__('os')"))  # None
print(safe_eval_number("[1, 2, 3]"))  # None (list, not number)

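Unlike eval(), ast.literal_eval() only evaluates literals and won’t execute arbitrary code. It is not risk-free for untrusted input, though: a very large or deeply nested string can still exhaust memory or the recursion limit, so cap the input length when the source is untrusted.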
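Because ast.literal_eval() follows Python literal syntax, it differs from int() on a few inputs. The sketch below adds a length cap and shows some of those differences; literal_number and MAX_LITERAL_LENGTH are hypothetical names, and the cap of 64 characters is an arbitrary choice.

```python
import ast

MAX_LITERAL_LENGTH = 64  # Arbitrary cap chosen for this sketch.


def literal_number(text):
    """Return an int or float parsed from a Python literal, else None."""
    if not isinstance(text, str) or len(text) > MAX_LITERAL_LENGTH:
        return None
    try:
        result = ast.literal_eval(text.strip())
    except (ValueError, SyntaxError, TypeError, MemoryError, RecursionError):
        return None
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(result, bool) or not isinstance(result, (int, float)):
        return None
    return result


print(literal_number("1_000"))  # 1000
print(literal_number("True"))   # None
print(literal_number("007"))    # None (leading zeros are a syntax error in literals)
print(int("007"))               # 7
print(literal_number("1+2j"))   # None (complex, not int or float)
```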

Pandas Integration for DataFrames

When working with data analysis, Pandas provides optimized conversion methods.

import pandas as pd
import numpy as np

# Create DataFrame with string numbers
df = pd.DataFrame({
    'id': ['1', '2', '3', '4'],
    'price': ['19.99', '29.99', 'invalid', '39.99'],
    'quantity': ['5', '10', '15', 'N/A']
})

# Convert with pd.to_numeric()
df['id'] = pd.to_numeric(df['id'], errors='raise')  # Raises on error
df['price'] = pd.to_numeric(df['price'], errors='coerce')  # NaN for invalid
df['quantity'] = pd.to_numeric(df['quantity'], errors='ignore')  # Keep original (deprecated since pandas 2.2)

print(df)
print(df.dtypes)

# Bulk conversion with astype()
df_numeric = df.copy()
df_numeric['id'] = df_numeric['id'].astype(int)

# Handle missing values during conversion
df['price_filled'] = pd.to_numeric(df['price'], errors='coerce').fillna(0.0)
print(df['price_filled'])  # [19.99, 29.99, 0.0, 39.99]

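The errors parameter in pd.to_numeric() provides three strategies: raise (throw an exception), coerce (convert invalid values to NaN), and ignore (return the input unchanged). The ignore option is deprecated as of pandas 2.2; in newer code, prefer coerce or catch the exception explicitly.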
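For integer columns with invalid entries, coerce produces a float column because NaN is a float. One way to keep integer semantics is the nullable Int64 dtype, sketched here on a standalone Series that mirrors the quantity column above. This assumes a pandas version that supports nullable integer dtypes.

```python
import pandas as pd

quantity = pd.Series(['5', '10', '15', 'N/A'])

# Coerce invalid entries to NaN, then switch to the nullable integer dtype.
quantity_int = pd.to_numeric(quantity, errors='coerce').astype('Int64')

print(quantity_int.dtype)         # Int64
print(quantity_int.isna().sum())  # 1
print(quantity_int.sum())         # 30
```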