How to Remove Non-Printable Characters from Strings in Python?

Published on 2024-11-09

Stripping Non-Printable Characters from a String in Python

Unlike Perl, Python's re module does not support POSIX character classes such as [[:print:]], so there is no ready-made pattern for matching non-printable characters.

So, how can you achieve this in Python?

One approach is to use the unicodedata module. The unicodedata.category function returns the Unicode general category of a character as a two-letter code. Control characters, which are non-printable, have the category Cc.
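As a quick illustration of the category codes, the following sketch looks up a handful of sample characters. The sample list is chosen here for illustration only and is not part of the original recipe.

```python
import unicodedata

# Illustrative sample: a letter, a space, three control characters,
# a zero-width space, and an accented letter.
samples = ['A', ' ', '\n', '\x7f', '\x85', '\u200b', '\u00e9']

# Map each character to its two-letter general category.
sample_categories = {ch: unicodedata.category(ch) for ch in samples}

for ch, cat in sample_categories.items():
    print(repr(ch), cat)
```

The newline, DEL (\x7f), and \x85 all report Cc, while the zero-width space reports Cf (format), so it is not matched by a filter that targets Cc alone.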

Using this knowledge, you can construct a custom character class that matches all control characters:

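import re
import sys
import unicodedata

# Every code point, including the last one (sys.maxunicode itself).
all_chars = (chr(i) for i in range(sys.maxunicode + 1))
# General categories to strip; 'Cc' is the control-character category.
categories = {'Cc'}
control_chars = ''.join(c for c in all_chars if unicodedata.category(c) in categories)

# Character class that matches any one of the collected characters.
control_char_re = re.compile('[%s]' % re.escape(control_chars))

def remove_control_chars(s):
    """Return s with every character in control_chars removed."""
    return control_char_re.sub('', s)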

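This function strips every Unicode control character (category Cc) from the input string. That covers the ASCII control characters, including tab, newline, and carriage return, as well as DEL and the C1 controls (U+0080 to U+009F). All other characters, including non-ASCII letters, are left untouched.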
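A short usage sketch may make the behaviour concrete. It repeats the definitions so that it runs on its own; the sample string is made up for illustration.

```python
import re
import sys
import unicodedata

# Collect every character whose general category is Cc (control).
control_chars = ''.join(
    chr(i) for i in range(sys.maxunicode + 1)
    if unicodedata.category(chr(i)) == 'Cc'
)
control_char_re = re.compile('[%s]' % re.escape(control_chars))

def remove_control_chars(s):
    return control_char_re.sub('', s)

# Hypothetical input: NUL, BEL, a C1 control, and a trailing newline.
sample = 'caf\u00e9\x00 au\x07 lait\x9f\n'
cleaned = remove_control_chars(sample)
print(repr(cleaned))
```

The accented letter survives, while the trailing newline is removed along with the other control characters. If line breaks or tabs need to be preserved, they would have to be excluded from control_chars first.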

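Alternatively, you can filter against Python's built-in string.printable constant, keeping only the characters it contains. However, string.printable holds ASCII characters only, so this approach also discards every non-ASCII character, such as accented letters, and may not suit all use cases.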
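A minimal sketch of the string.printable approach might look like this. The helper name keep_ascii_printable is an assumption made for illustration and does not come from the standard library.

```python
import string

# string.printable is a str of ASCII digits, letters, punctuation,
# and whitespace; a set makes membership tests fast.
PRINTABLE = set(string.printable)

def keep_ascii_printable(s):
    """Keep only characters that appear in string.printable."""
    return ''.join(ch for ch in s if ch in PRINTABLE)

result = keep_ascii_printable('caf\u00e9\x00 au lait\n')
print(repr(result))
```

Note the two differences from the regex version: the accented letter is dropped because it is not ASCII, and the newline is kept because string.printable includes ASCII whitespace.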

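Because the Cc category consists of exactly two contiguous ranges (U+0000 to U+001F and U+007F to U+009F), you can also build the same character class directly, without scanning every code point:

import itertools

control_chars = ''.join(map(chr, itertools.chain(range(0x00,0x20), range(0x7f,0xa0))))

This produces the same 65 characters as the category-based version: the C0 controls, DEL, and the C1 controls. It does not cover other invisible characters, such as zero-width spaces, which belong to the Cf (format) category.

With either definition of control_chars, remove_control_chars works on any string, whether or not it contains non-ASCII text. To strip further kinds of invisible characters, add their categories to the categories set in the first version.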
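To check how the hard-coded ranges relate to the Cc category, one could compare the two sets directly. This is only a sanity-check sketch; the variable names are chosen here for illustration.

```python
import itertools
import sys
import unicodedata

# Characters from the two hard-coded ranges.
hard_coded = set(map(chr, itertools.chain(range(0x00, 0x20), range(0x7f, 0xa0))))

# Characters found by scanning every code point for category Cc.
from_category = {
    chr(i) for i in range(sys.maxunicode + 1)
    if unicodedata.category(chr(i)) == 'Cc'
}

print(hard_coded == from_category)
print(len(hard_coded))
```

The hard-coded form avoids the full scan at import time, at the cost of no longer being driven by the categories set.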
