How to Use CLEAN in Excel
• CLEAN removes non-printable ASCII characters (0-31) from text, making it essential for sanitizing data imported from external systems, databases, or web sources
Key Insights
• CLEAN removes non-printable ASCII characters (0-31) from text, making it essential for sanitizing data imported from external systems, databases, or web sources • Always combine CLEAN with TRIM for comprehensive text cleanup, since CLEAN doesn’t handle extra spaces and TRIM doesn’t remove non-printable characters • CLEAN has limitations—it won’t remove visible special characters like symbols or Unicode characters, requiring SUBSTITUTE or custom solutions for those scenarios
Introduction to the CLEAN Function
The CLEAN function is Excel’s specialized tool for removing non-printable characters from text strings. These invisible characters—ASCII characters 0 through 31—often sneak into your data when importing from databases, web APIs, legacy systems, or poorly formatted CSV files. They cause problems you can’t see: formulas that should match don’t, sorting behaves unexpectedly, and data exports fail validation.
I’ve seen datasets where everything looks perfect on screen, but VLOOKUP returns #N/A because of hidden line breaks. Or where concatenated addresses produce bizarre spacing because of embedded carriage returns. CLEAN solves these problems by stripping out the invisible troublemakers that corrupt your data.
The most common use cases include cleaning data from SQL database exports, sanitizing web-scraped content, fixing CSV imports from mainframe systems, and preparing data for APIs that reject non-printable characters. If you’re working with external data sources, CLEAN should be in your standard data processing toolkit.
Syntax and Basic Usage
The CLEAN function has the simplest syntax in Excel:
=CLEAN(text)
The function takes a single argument—the text you want to clean—and returns that text with all non-printable characters removed. Here’s a basic example:
=CLEAN(A1)
Let’s say cell A1 contains text imported from a database that includes embedded line breaks (character 10) and carriage returns (character 13). The cell might display like this:
Customer Name:
John Smith
Account: 12345
Applying =CLEAN(A1) produces:
Customer Name:John SmithAccount: 12345
The line breaks disappear, leaving continuous text. This is exactly what you want when preparing data for systems that can’t handle multi-line cell content.
Common Scenarios for Using CLEAN
Cleaning Web-Scraped Data
When you import data from websites using Power Query or web scraping tools, you’ll often get HTML entities, line breaks, and tab characters mixed into your text. CLEAN removes these non-printable characters before you process the data further.
Preparing Data for Export
Many external systems reject files containing non-printable characters. Before exporting to CSV for upload to accounting systems, CRMs, or e-commerce platforms, run CLEAN on all text columns to ensure compatibility.
Fixing CSV Import Issues
Legacy systems and mainframe exports often include control characters that cause Excel to misinterpret column breaks or create phantom rows. Here’s a before-and-after comparison:
Before (Raw Import):
ProductID Description
100 Widget[CHAR(10)]Premium
101 Gadget[CHAR(13)]Standard
After CLEAN:
=CLEAN(B2)
Result:
ProductID Description
100 WidgetPremium
101 GadgetStandard
The non-printable characters are gone, but notice the words run together. This is where combining functions becomes critical.
Combining CLEAN with Other Functions
CLEAN rarely works alone in production scenarios. You’ll typically chain it with TRIM, SUBSTITUTE, UPPER, LOWER, or PROPER for complete data normalization.
CLEAN + TRIM for Complete Whitespace Handling
=TRIM(CLEAN(A1))
This combination is the gold standard for text cleanup. CLEAN removes non-printable characters, then TRIM removes leading, trailing, and excess internal spaces. If A1 contains:
Customer Name[CHAR(10)] with spaces
The formula returns:
Customer Name with spaces
Comprehensive Text Normalization
For a complete data cleaning pipeline, nest multiple functions:
=UPPER(TRIM(CLEAN(A1)))
This formula:
- Removes non-printable characters (CLEAN)
- Removes extra spaces (TRIM)
- Converts to uppercase (UPPER)
For proper case names, use:
=PROPER(TRIM(CLEAN(A1)))
Practical Multi-Column Cleanup
When cleaning imported contact data:
=TRIM(CLEAN(A2))&" "&TRIM(CLEAN(B2))
This concatenates first and last names from separate columns while ensuring both are properly cleaned.
CLEAN vs. TRIM: Understanding the Difference
Understanding what each function removes prevents confusion and helps you choose the right tool.
CLEAN removes:
- Non-printable ASCII characters (0-31)
- Line breaks (CHAR(10))
- Carriage returns (CHAR(13))
- Tab characters (CHAR(9))
- Other control characters
TRIM removes:
- Leading spaces
- Trailing spaces
- Multiple consecutive spaces (reduces to single space)
CLEAN does NOT remove:
- Regular spaces
- Visible special characters (@, #, $, etc.)
- Unicode characters
TRIM does NOT remove:
- Non-printable characters
- Special characters
- Single spaces between words
Here’s a side-by-side demonstration:
Original (A1): " Text[CHAR(10)]with spaces "
=CLEAN(A1): " Textwith spaces "
=TRIM(A1): "Text[CHAR(10)]with spaces"
=TRIM(CLEAN(A1)): "Textwith spaces"
Notice that CLEAN preserves the extra spaces but removes the line break, while TRIM removes spaces but preserves the line break. Only the combination handles both issues.
Limitations and Alternatives
CLEAN is powerful but limited. It only removes ASCII characters 0-31. It won’t touch:
- Visible special characters (©, ®, ™, •, §)
- Unicode characters
- Specific symbols you want to remove
- Non-breaking spaces (CHAR(160))
Removing Specific Characters with SUBSTITUTE
To remove visible special characters that CLEAN ignores:
=SUBSTITUTE(CLEAN(A1), "®", "")
This removes the registered trademark symbol after cleaning non-printable characters.
Removing Multiple Specific Characters
Nest SUBSTITUTE calls for multiple replacements:
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(CLEAN(A1), "®", ""), "©", ""), "™", "")
Handling Non-Breaking Spaces
Non-breaking spaces (CHAR(160)) are common in web data but CLEAN doesn’t remove them:
=TRIM(SUBSTITUTE(CLEAN(A1), CHAR(160), " "))
This replaces non-breaking spaces with regular spaces, then TRIM removes excess spacing.
When You Need VBA
For complex pattern matching or removing entire character classes, VBA provides more control:
Function DeepClean(txt As String) As String
Dim i As Integer
Dim result As String
result = ""
For i = 1 To Len(txt)
If Asc(Mid(txt, i, 1)) >= 32 And Asc(Mid(txt, i, 1)) <= 126 Then
result = result & Mid(txt, i, 1)
End If
Next i
DeepClean = result
End Function
This custom function keeps only standard printable ASCII characters (32-126), removing everything else.
Best Practices and Tips
Apply CLEAN Early in Your Workflow
Clean data immediately after import, before any analysis or calculations. Create a “Raw Data” sheet and a “Cleaned Data” sheet with CLEAN formulas applied to all text columns. This prevents non-printable characters from propagating through your workbook.
Use with Data Validation
After cleaning, apply data validation to prevent users from pasting problematic data:
- Select your data range
- Data > Data Validation
- Set criteria that reject entries with non-printable characters
Performance Considerations
CLEAN is computationally inexpensive, but with datasets over 100,000 rows, consider:
- Converting formulas to values after cleaning (Copy > Paste Special > Values)
- Using Power Query for large-scale data transformation
- Cleaning data before importing into Excel when possible
Create Reusable Cleaning Formulas
Build a standard cleaning formula for your organization:
=PROPER(TRIM(SUBSTITUTE(CLEAN(A1), CHAR(160), " ")))
Document this as your standard text normalization function and use it consistently across all data imports.
Test Your Cleaning
After applying CLEAN, verify the results:
=LEN(A1)-LEN(CLEAN(A1))
This shows how many characters were removed. If the result is 0, either your data was already clean or CLEAN can’t handle the characters present.
The CLEAN function is simple but essential. Master it, understand its limitations, and combine it strategically with other text functions. Your data quality will improve immediately, and you’ll spend less time troubleshooting mysterious data issues caused by invisible characters.