Python re.sub() usage examples: removing all non-printable characters (r'[^\x20-\x7E]') and all non-alphanumeric characters (r'[^a-zA-Z0-9]') to clean up garbled text
Reading the serial number of an NVIDIA Jetson box is a pain: the value turns out to contain non-printable characters
import pprint

from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route('/get_serial_number', methods=['GET'])
def http_get_serial_number():
    print('')
    print(f'--------------- request: {request} ---------------')
    # Read the serial number file
    try:
        with open('/sys/firmware/devicetree/base/serial-number', 'r') as file:
            # Strip any leading and trailing whitespace
            serial_number = file.read().strip()
    except Exception as e:
        print(f"Error reading serial number: {e}")
        return jsonify({'ErrorCode': -1, 'ErrorInfo': f'Unable to read serial number, Exception: [{e}]'}), 200
    # Build the dictionary that carries the serial number
    serial_info = {'boxNo': serial_number, 'ErrorCode': 0}
    print('current serial_info:')
    pprint.pprint(serial_info)
    return jsonify(serial_info)
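Before stripping anything, it can help to see which characters are actually in the way. The sketch below is only an illustration: `find_non_printable` is a hypothetical helper, and the sample string is made up. The trailing NUL (`\x00`) in the sample is an assumption based on device-tree string properties usually being NUL-terminated; note that `str.strip()` does not remove it.

```python
def find_non_printable(text):
    """Return (index, hex code) pairs for characters outside 0x20-0x7E."""
    return [
        (index, f"0x{ord(ch):02X}")
        for index, ch in enumerate(text)
        if not 0x20 <= ord(ch) <= 0x7E
    ]


# Hypothetical sample value; what a real device returns may differ.
sample = "1234567890\x00"

# repr() shows escape sequences for characters that print() would hide.
print(repr(sample.strip()))         # '1234567890\x00'
print(find_non_printable(sample))   # [(10, '0x00')]
```

Printing `repr(serial_number)` inside the route is a quick way to run the same check on the real value.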
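Removing all non-printable characters (r'[^\x20-\x7E]')
The following line removes all non-printable characters:
serial_number = re.sub(r'[^\x20-\x7E]', '', serial_number)
Here "non-printable" means every character outside the printable ASCII range (hex 20 to 7E); the printable characters are listed in the appendix at the end. Note that the pattern also removes every non-ASCII character, such as Chinese text or accented letters.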
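As a quick standalone check of what the pattern does, here is a minimal sketch. The helper name `strip_non_printable` and the sample strings are made up for illustration and are not part of the original route.

```python
import re

# Matches any single character outside the printable ASCII range 0x20-0x7E.
NON_PRINTABLE = re.compile(r'[^\x20-\x7E]')


def strip_non_printable(text):
    """Delete every character that is not printable ASCII."""
    return NON_PRINTABLE.sub('', text)


print(repr(strip_non_printable("1234567890\x00")))  # '1234567890'
print(repr(strip_non_printable("a\tb\nc")))         # 'abc'
print(repr(strip_non_printable("ID: 42 号")))        # 'ID: 42 '
```

Tabs and newlines are removed as well, since they sit below 0x20, while ordinary spaces (0x20) are kept.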
The full route with this fix applied:
import pprint
import re

from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route('/get_serial_number', methods=['GET'])
def http_get_serial_number():
    print('')
    print(f'--------------- request: {request} ---------------')
    # Read the serial number file
    try:
        with open('/sys/firmware/devicetree/base/serial-number', 'r') as file:
            # Strip any leading and trailing whitespace
            serial_number = file.read().strip()
            # The value read from the file really does contain non-printable
            # characters, so remove them with a regular expression
            serial_number = re.sub(r'[^\x20-\x7E]', '', serial_number)  # remove all non-printable characters
    except Exception as e:
        print(f"Error reading serial number: {e}")
        return jsonify({'ErrorCode': -1, 'ErrorInfo': f'Unable to read serial number, Exception: [{e}]'}), 200
    # Build the dictionary that carries the serial number
    serial_info = {'boxNo': serial_number, 'ErrorCode': 0}
    print('current serial_info:')
    pprint.pprint(serial_info)
    return jsonify(serial_info)
Removing all non-alphanumeric characters (r'[^a-zA-Z0-9]')
To be stricter, remove everything that is not an ASCII letter or digit:
serial_number = re.sub(r'[^a-zA-Z0-9]', '', serial_number)
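A minimal sketch of the stricter variant follows. `keep_alphanumeric` is a hypothetical helper name and the inputs are invented samples; the point is that spaces, hyphens, and other punctuation disappear along with the control characters.

```python
import re


def keep_alphanumeric(text):
    """Keep only ASCII letters and digits; drop everything else."""
    return re.sub(r'[^a-zA-Z0-9]', '', text)


print(keep_alphanumeric("SN-1234 5678\x00"))  # SN12345678
print(keep_alphanumeric("box_01 (rev. B)"))   # box01revB
```

This is a reasonable choice when the value is known to be purely alphanumeric. If separators such as hyphens are meaningful, the printable-ASCII pattern is the safer one.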
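Appendix: all printable characters (everything else is non-printable)
The hexadecimal values 20 through 7E cover all 95 printable characters in the ASCII table. They are:
- 20: space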
- 21: !
- 22: "
- 23: #
- 24: $
- 25: %
- 26: &
- 27: '
- 28: (
- 29: )
- 2A: *
- 2B: +
- 2C: ,
- 2D: -
- 2E: .
- 2F: /
Next come the digits 0-9:
- 30: 0
- 31: 1
- 32: 2
- 33: 3
- 34: 4
- 35: 5
- 36: 6
- 37: 7
- 38: 8
- 39: 9
Then the colon, the semicolon, and a few other symbols:
- 3A: :
- 3B: ;
- 3C: <
- 3D: =
- 3E: >
- 3F: ?
- 40: @
Next are the uppercase letters A-Z:
- 41: A
- 42: B
- 43: C
- 44: D
- 45: E
- 46: F
- 47: G
- 48: H
- 49: I
- 4A: J
- 4B: K
- 4C: L
- 4D: M
- 4E: N
- 4F: O
- 50: P
- 51: Q
- 52: R
- 53: S
- 54: T
- 55: U
- 56: V
- 57: W
- 58: X
- 59: Y
- 5A: Z
These are followed by a few common punctuation marks:
- 5B: [
- 5C: \
- 5D: ]
- 5E: ^
- 5F: _
- 60: `
Then the lowercase letters a-z:
- 61: a
- 62: b
- 63: c
- 64: d
- 65: e
- 66: f
- 67: g
- 68: h
- 69: i
- 6A: j
- 6B: k
- 6C: l
- 6D: m
- 6E: n
- 6F: o
- 70: p
- 71: q
- 72: r
- 73: s
- 74: t
- 75: u
- 76: v
- 77: w
- 78: x
- 79: y
- 7A: z
And finally the remaining punctuation and special characters:
- 7B: {
- 7C: |
- 7D: }
- 7E: ~
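Each hexadecimal value corresponds to one specific character in the ASCII table, starting with the space at decimal 32 (hex 20) and ending with the tilde (~) at decimal 126 (hex 7E).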
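The table above does not need to be typed by hand. As a small sketch, the loop below regenerates it from the code points and confirms that the range holds 95 characters, each of which is left untouched by the non-printable pattern. The variable names are arbitrary.

```python
import re

# Code points 0x20 through 0x7E inclusive (range() excludes its upper bound).
printable = [chr(code) for code in range(0x20, 0x7F)]

for code in range(0x20, 0x7F):
    # !r quotes the character so the space at 0x20 is visible.
    print(f"{code:02X}: {chr(code)!r}")

print(len(printable))  # 95

# Every one of these characters survives the non-printable filter.
survivors = re.sub(r'[^\x20-\x7E]', '', ''.join(printable))
print(survivors == ''.join(printable))  # True
```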