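Go Cookbook CN Series 06: Creating, Converting, and Working with Strings
This article covers the core string operations in Go: how strings are created and converted, and how to handle common tasks such as concatenation, number conversion, replacement, and substring extraction. Along the way it explains the underlying properties of Go strings that make these operations behave the way they do. By the end you should be able to handle most everyday string tasks efficiently.
[Key takeaways]
- Learn the three kinds of string and character literals in Go, and the difference between byte and rune
- Convert between string and byte slices in both directions, and compare the string concatenation approaches and their performance
- Convert between strings and numbers, and handle the errors that conversion can produce
- Use the core functions for replacing, extracting substrings, checking containment, splitting, joining, and trimming
- Capture string input from the command line, escape HTML, and process strings with regular expressions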
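6.0 String Fundamentals
6.0.1 What a String Is Under the Hood
String handling is one of the most common tasks in Go development; both user interaction and machine-to-machine communication depend on it. The standard library provides several packages for working with strings:
- strconv: converts between strings and other types
- fmt: formats strings with verbs (placeholders), similar to C's printf
- unicode/utf8 and unicode/utf16: handle Unicode-encoded text
- strings: contains most of the commonly used string functions
In Go, a string is a read-only slice of bytes that can hold arbitrary bytes (no particular encoding is required), unlike the "sequence of characters" definition used in many other languages. Go has no dedicated char type; characters are represented by two types:
- byte: an alias for uint8; holds a single byte, which is enough for an ASCII character
- rune: an alias for int32; holds one Unicode code point, which occupies one to four bytes when encoded as UTF-8
Note that indexing a string yields a byte, not a character, and that a single Unicode character may consist of several bytes or even several code points.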
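The byte-versus-rune distinction above can be seen in a few lines. This is a minimal sketch; the helper name byteAndRuneCounts and the sample text are made up for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCounts is a hypothetical helper: it returns the number of bytes
// and the number of runes (Unicode code points) in s.
func byteAndRuneCounts(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "Go 世界" // sample text: 3 ASCII bytes plus two 3-byte characters
	nb, nr := byteAndRuneCounts(s)
	fmt.Println("bytes:", nb, "runes:", nr) // bytes: 9 runes: 5

	// Indexing yields one byte, here the first byte of a 3-byte character.
	fmt.Printf("s[3] = %#x\n", s[3])

	// Ranging over a string decodes UTF-8 and yields runes with byte offsets.
	for i, r := range s {
		fmt.Printf("offset %d: %c\n", i, r)
	}
}
```

len counts bytes, so code that needs a character count should use utf8.RuneCountInString or iterate with range.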
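6.1 Creating Strings
6.1.1 Single-Line Strings (Double Quotes)
Double quotes "" create an interpreted string literal on a single line. Escape sequences are supported (for example \n for a newline and \" for a double quote):
// A basic single-line string
var str = "A simple string"
// With a newline escape
var strWithNewline = "A simple string\n"
// With escaped double quotes
var strWithQuote = "A \"simple\" string"
6.1.2 Multi-Line Raw Strings (Backticks)
Backticks `` create a raw string literal: escape sequences are not interpreted, and the literal may span several lines (a double-quoted literal that spans lines is a syntax error):
// Wrong: a double-quoted literal cannot span lines
// var str = "
// A simple string
// "
// Correct: a backtick-quoted raw string may span lines
var str = `
A simple string
`
The compiler keeps the content between the backticks as written, with one exception: carriage return characters (\r) are discarded.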
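To make the difference between the two literal forms concrete, here is a small sketch; the constant names interpreted and raw are chosen only for this example.

```go
package main

import "fmt"

// The same characters appear between the quotes in both literals, but only
// the double-quoted form interprets the \n escape sequence.
const (
	interpreted = "line1\nline2"
	raw         = `line1\nline2`
)

func main() {
	fmt.Println(len(interpreted)) // 11: \n is a single newline byte
	fmt.Println(len(raw))         // 12: the backslash and 'n' are two bytes
	fmt.Println(interpreted == raw) // false
}
```

Raw literals are convenient for regular expressions and embedded text, where backslashes would otherwise need to be doubled.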
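6.1.3 Single Characters (Single Quotes)
Single quotes create a character (rune) literal. Its default type is rune (int32), and it can be declared explicitly as a byte:
// Default type is rune (int32)
var c = 'A'
// Explicitly typed as byte
var cByte byte = 'A'
6.2 Converting Between string and []byte
Because a Go string is essentially a read-only sequence of bytes, a type conversion is all that is needed in either direction; no helper function is required. Semantically, each conversion produces a copy of the data.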
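Since strings are read-only, changing a byte means converting to []byte, editing the copy, and converting back. The sketch below illustrates that round trip; upperFirst is a hypothetical helper and handles ASCII only.

```go
package main

import "fmt"

// upperFirst is a hypothetical helper: it returns a copy of s whose first
// byte is upper-cased when that byte is an ASCII lowercase letter.
func upperFirst(s string) string {
	if s == "" {
		return s
	}
	b := []byte(s) // the conversion copies the bytes of s
	if b[0] >= 'a' && b[0] <= 'z' {
		b[0] -= 'a' - 'A'
	}
	return string(b) // converting back yields a new immutable string
}

func main() {
	orig := "simple"
	changed := upperFirst(orig)
	fmt.Println(orig, changed) // simple Simple
}
```

The original string is untouched because the byte slice is a separate copy.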
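6.2.1 String to Byte Slice
str := "This is a simple string" // define a string
bytes := []byte(str) // convert to a byte slice
6.2.2 Byte Slice to String
// A hand-written byte slice (the bytes of "This is a simple string")
bytes := []byte{84, 104, 105, 115, 32, 105, 115, 32, 97, 32, 115, 105, 109, 112, 108, 101, 32, 115, 116, 114, 105, 110, 103}
str := string(bytes) // convert to a string
6.3 Concatenating and Building Strings (from Other Data or Strings)
There are several ways to build a new string from existing strings or data, and their performance differs noticeably.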
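6.3.1 Direct Concatenation (+ Operator)
The simplest approach, and the fastest in the benchmark in 6.3.5:
var str string = "The time is " + time.Now().Format(time.Kitchen) + " now."
Example result: The time is 5:28PM now.
6.3.2 strings.Join
Takes a slice of strings and a separator, and joins the elements into a single string:
var str string = strings.Join([]string{"The time is", time.Now().Format(time.Kitchen), "now."}, " ")
6.3.3 fmt.Sprint and fmt.Sprintf
- fmt.Sprint: accepts arguments of any type and concatenates their default formats
- fmt.Sprintf: builds the string from a format string with verbs such as %v, which is more flexible
// fmt.Sprint
var str1 string = fmt.Sprint("The time is ", time.Now().Format(time.Kitchen), " now.")
// fmt.Sprintf (format verbs); here time.Now() is printed in its full default format
var str2 string = fmt.Sprintf("The time is %v now.", time.Now())
6.3.4 strings.Builder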
Well suited to assembling a string from many pieces: write the pieces step by step, then extract the result:
// Option 1: WriteString appends strings
var builder1 strings.Builder
builder1.WriteString("The time is ")
builder1.WriteString(time.Now().Format(time.Kitchen))
builder1.WriteString(" now.")
var str3 string = builder1.String()
// Option 2: fmt.Fprint writes values of any type
var builder2 strings.Builder
fmt.Fprint(&builder2, "The time is ")
fmt.Fprint(&builder2, time.Now())
fmt.Fprint(&builder2, " now.")
var str4 string = builder2.String()
strings.Builder also provides Write (a byte slice), WriteByte (a single byte), and WriteRune (a single rune).
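The other write methods can be combined freely on one builder. The following is a minimal sketch; buildGreeting is a hypothetical helper, and the Grow call is an optional extra that the text above does not cover.

```go
package main

import (
	"fmt"
	"strings"
)

// buildGreeting is a hypothetical helper that exercises each write method
// of strings.Builder once.
func buildGreeting(name string) string {
	var b strings.Builder
	b.Grow(len(name) + 16)   // optional: reserve capacity up front
	b.Write([]byte("Hello")) // Write appends a byte slice
	b.WriteByte(',')         // WriteByte appends a single byte
	b.WriteRune(' ')         // WriteRune appends one rune, UTF-8 encoded
	b.WriteString(name)      // WriteString appends a string
	b.WriteRune('!')
	return b.String()
}

func main() {
	fmt.Println(buildGreeting("世界")) // Hello, 世界!
}
```

A zero-value Builder is ready to use, but it should not be copied once it has been written to.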
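6.3.5 Performance Comparison
The benchmark code and results below show that direct concatenation (+) is the fastest here, and that the fmt functions are the slowest, especially when they have to format a non-string argument such as a time.Time value.
Benchmark code: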
package string
import (
	"fmt"
	"strings"
	"testing"
	"time"
)
func BenchmarkStringConcat(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = "The time is " + time.Now().Format(time.Kitchen) + " now."
	}
}
func BenchmarkStringJoin(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = strings.Join([]string{"The time is", time.Now().Format(time.Kitchen), "now."}, "")
	}
}
func BenchmarkStringSprint(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = fmt.Sprint("The time is ", time.Now().Format(time.Kitchen), "now.")
	}
}
func BenchmarkStringSprintDiff(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = fmt.Sprint("The time is ", time.Now(), " now.")
	}
}
func BenchmarkStringSprintf(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = fmt.Sprintf("The time is %v now.", time.Now().Format(time.Kitchen))
	}
}
func BenchmarkStringSprintfDiff(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = fmt.Sprintf("The time is %s now.", time.Now())
	}
}
func BenchmarkStringBuilderFprint(b *testing.B) {
	for i := 0; i < b.N; i++ {
		var builder strings.Builder
		fmt.Fprint(&builder, "The time is ")
		fmt.Fprint(&builder, time.Now())
		fmt.Fprint(&builder, " now.")
		_ = builder.String()
	}
}
func BenchmarkStringBuilderWriteString(b *testing.B) {
	for i := 0; i < b.N; i++ {
		var builder strings.Builder
		builder.WriteString("The time is ")
		builder.WriteString(time.Now().Format(time.Kitchen))
		builder.WriteString(" now.")
		_ = builder.String()
	}
}
Test command:
go test -bench=BenchmarkString
Test results:
goos: darwin
goarch: arm64
pkg: github.com/sausheong/gocookbook/ch06_string
BenchmarkStringConcat-10 5787976 206.7 ns/op
BenchmarkStringJoin-10 5121637 235.0 ns/op
BenchmarkStringSprint-10 3680838 323.8 ns/op
BenchmarkStringSprintDiff-10 1541514 779.9 ns/op
BenchmarkStringSprintf-10 4032438 297.8 ns/op
BenchmarkStringSprintfDiff-10 1610212 740.9 ns/op
BenchmarkStringBuilderFprint-10 2580783 464.2 ns/op
BenchmarkStringBuilderWriteString-10 4866556 247.0 ns/op
PASS
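ok github.com/sausheong/gocookbook/ch06_string 13.025s
6.4 Converting Between Strings and Numbers
These conversions rely on the strconv package, whose functions fall into two main groups:
- Parse functions: string to number (reading a string)
- Format functions: number to string (creating a string)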
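As a quick illustration of how the two groups pair up, the sketch below parses a hexadecimal string and formats the value in binary. The helper hexToBinary is hypothetical; the range check at the end uses strconv.ErrRange, which the article itself does not discuss.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// hexToBinary is a hypothetical helper: it parses s as a base-16 integer
// (Parse group) and renders it in base 2 (Format group).
func hexToBinary(s string) (string, error) {
	n, err := strconv.ParseInt(s, 16, 64)
	if err != nil {
		return "", err
	}
	return strconv.FormatInt(n, 2), nil
}

func main() {
	bin, err := hexToBinary("7b")
	fmt.Println(bin, err) // 1111011 <nil>

	// A value that does not fit the requested bit size reports ErrRange.
	_, err = strconv.ParseInt("300", 10, 8)
	fmt.Println(errors.Is(err, strconv.ErrRange)) // true
}
```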
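6.4.1 String to Number
(1) To an integer
- Atoi: the convenience form, equivalent to ParseInt(s, 10, 0) with the result converted to int
- ParseInt: the general form; takes a base (2 to 36) and a bitSize (0 to 64)
// Atoi (string to int)
i, err := strconv.Atoi("123")
// ParseInt (signed integers, with base and bit size; ParseUint is the unsigned counterpart)
i2, err := strconv.ParseInt("123", 10, 0)
(2) To a floating-point number
ParseFloat takes a bitSize (32 for float32, 64 for float64); the return type is always float64:
f, err := strconv.ParseFloat("1.234", 64)
(3) To a boolean
ParseBool accepts 1, t, T, TRUE, true, True (true) and 0, f, F, FALSE, false, False (false):
b, err := strconv.ParseBool("TRUE")
(4) Error handling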
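When parsing fails, every Parse function (and Atoi) returns an error whose concrete type is *strconv.NumError, from which the details can be extracted:
str := "Not a number"
_, err := strconv.Atoi(str)
var e *strconv.NumError
if errors.As(err, &e) { // errors.As avoids the panic a failed type assertion would cause
	fmt.Println("Func:", e.Func)
	fmt.Println("Num:", e.Num)
	fmt.Println("Err:", e.Err)
	fmt.Println(err)
}
Output:
Func: Atoi
Num: Not a number
Err: invalid syntax
strconv.Atoi: parsing "Not a number": invalid syntax
6.4.2 Number to String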
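(1) Integer to string
- Itoa: the convenience form, equivalent to FormatInt(int64(n), 10)
- FormatInt: the general form; takes a base (2 to 36)
// Itoa (int to string)
str := strconv.Itoa(123)
// FormatInt (with a base, here binary)
str2 := strconv.FormatInt(int64(123), 2) // result: "1111011"
(2) Floating-point number to string
FormatFloat takes a format, a precision, and a bitSize. The format values are:
- f: no exponent
- e/E: with a decimal exponent
- g/G: exponent form for large exponents, otherwise no exponent
- b/x/X: binary-exponent and hexadecimal forms (special cases)
The precision argument prec: -1 selects the smallest number of digits needed to represent the value exactly.
var v float64 = 123456.123456
var s string
// f format (no exponent)
s = strconv.FormatFloat(v, 'f', -1, 64) // 123456.123456
s = strconv.FormatFloat(v, 'f', 4, 64) // 123456.1235
s = strconv.FormatFloat(v, 'f', 9, 64) // 123456.123456000
// e/E format (scientific notation)
s = strconv.FormatFloat(v, 'e', -1, 64) // 1.23456123456e+05
s = strconv.FormatFloat(v, 'E', -1, 64) // 1.23456123456E+05
// g/G format (adaptive)
s = strconv.FormatFloat(v, 'g', -1, 64) // 123456.123456
s = strconv.FormatFloat(v, 'g', 4, 64) // 1.235e+05
6.5 Replacing Parts of a String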
6.5.1 strings.Replace and strings.ReplaceAll
- Replace: takes the number of replacements to make (-1 means replace all)
- ReplaceAll: the convenience form, equivalent to Replace(..., -1)
Example (using a quotation from Great Expectations):
var quote string = `I loved her against reason, against promise, against peace, against hope, against happiness, against all discouragement that could be.`
// Replace the first "against"
replaced := strings.Replace(quote, "against", "with", 1)
// Replace the first two occurrences of "against"
replaced2 := strings.Replace(quote, "against", "with", 2)
// Replace every "against" (equivalent to ReplaceAll)
replacedAll := strings.Replace(quote, "against", "with", -1)
6.5.2 strings.Replacer (Multiple Replacements)
Suited to batch replacement; once created, a replacer can be reused:
// Create a replacer: her→him, against→for, all→some
replacer := strings.NewReplacer("her", "him", "against", "for", "all", "some")
replaced := replacer.Replace(quote)
// Output: I loved him for reason, for promise, for peace, for hope, for happiness, for some discouragement that could be.
6.5.3 Performance Comparison
- Single replacement: strings.Replace is faster than strings.Replacer
- Multiple replacements: strings.Replacer is more efficient
Benchmark results for reference:
goos: darwin
goarch: arm64
pkg: github.com/sausheong/gocookbook/ch06_string
BenchmarkOneReplace-10 7264310 156.9 ns/op
BenchmarkOneReplacer-10 4336489 276.0 ns/op
BenchmarkReplace-10 2250291 532.1 ns/op
BenchmarkReplacerCreate-10 31878366 37.13 ns/op
BenchmarkReplacer-10 4671319 255.0 ns/op
PASS
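ok github.com/sausheong/gocookbook/ch06_string 4.547s
6.6 Extracting Substrings
6.6.1 Extracting with a Slice Expression
Because a string behaves like a slice of bytes, a substring can be taken with a slice expression on byte indices:
var quote string = `I loved her against reason, against promise,
against peace, against hope, against happiness,
against all discouragement that could be.`
// Extract "against reason" (byte indices 12 to 26)
subStr := quote[12:26]
6.6.2 Locating with strings.Index (No Manual Counting)
Use strings.Index to find where the substring starts, then add its length to get the end index:
// Find the starting index of the substring
i := strings.Index(quote, "against reason")
// Compute the end index (start index + substring length)
j := i + len("against reason")
// Extract the substring
subStr := quote[i:j]
6.6.3 Notes
Pitfall: do not count characters by hand. In UTF-8, some characters (emoji and many non-ASCII symbols, for example) take several bytes, while slice indices are byte offsets. Locating the substring with strings.Index plus its length avoids cutting a character in half.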
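When the goal is "the first n characters" rather than a known substring, iterating with range gives byte offsets that always fall on character boundaries. The sketch below shows one way to do this; firstNRunes is a hypothetical helper, not a standard library function.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstNRunes is a hypothetical helper: it returns the first n characters
// (runes) of s without ever splitting a multi-byte character.
func firstNRunes(s string, n int) string {
	if n <= 0 {
		return ""
	}
	count := 0
	for i := range s { // i is the byte offset where each rune starts
		if count == n {
			return s[:i]
		}
		count++
	}
	return s // s has n runes or fewer
}

func main() {
	s := "世界你好"
	fmt.Println(firstNRunes(s, 2))        // 世界
	fmt.Println(utf8.ValidString(s[:2])) // false: a byte slice cut mid-character
}
```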
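6.7 Checking What a String Contains
6.7.1 Containing a Substring
- strings.Contains: returns a boolean directly (the convenience form)
- strings.Index: returns the starting index of the substring (>= 0 means present, -1 means absent)
The two perform the same, because Contains simply wraps the Index check:
var quote string = `I loved her against reason, against promise,
against peace, against hope, against happiness,
against all discouragement that could be.`
// Check with Contains
hasAgainst := strings.Contains(quote, "against") // true
// Check with Index
idx := strings.Index(quote, "against") // 12 (>= 0 means present)
6.7.2 Checking a Prefix or Suffix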
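- HasPrefix: reports whether the string starts with the given prefix
- HasSuffix: reports whether the string ends with the given suffix
// Prefix check
hasPrefix := strings.HasPrefix(quote, "I loved") // true
// Suffix check
hasSuffix := strings.HasSuffix(quote, "could be.") // true
The same checks can be written by hand with slice expressions, as long as the length is checked first (slicing past the end of a shorter string panics):
// Manual prefix check
prefix := "I loved"
if len(quote) >= len(prefix) && quote[:len(prefix)] == prefix {
	// prefix matched
}
// Manual suffix check
suffix := "could be."
if len(quote) >= len(suffix) && quote[len(quote)-len(suffix):] == suffix {
	// suffix matched
}
6.8 Splitting and Joining Strings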
6.8.1 Splitting
(1) strings.Split, SplitN, and SplitAfter
- Split: splits on the given separator into all substrings
- SplitN: limits the number of substrings returned (the remainder is left unsplit in the last element)
- SplitAfter: like Split, but keeps the separator at the end of each element
var quote string = `I loved her against reason, against promise,
against peace, against hope, against happiness,
against all discouragement that could be.`
// Note: the outputs below assume each line of the quote ends with a space before the line break
// Split (on a single space)
array := strings.Split(quote, " ")
// Output: ["I" "loved" "her" "against" "reason," "against" "promise," "\nagainst" "peace," "against" "hope," "against" "happiness," "\nagainst" "all" "discouragement" "that" "could" "be."]
// SplitN (at most 10 elements)
arrayN := strings.SplitN(quote, " ", 10)
// Output: ["I" "loved" "her" "against" "reason," "against" "promise," "\nagainst" "peace," "against hope, against happiness, \nagainst all discouragement that could be."]
// SplitAfter (keeps the separator)
arrayAfter := strings.SplitAfter(quote, " ")
// Output: ["I " "loved " "her " "against " "reason, " "against " "promise, " "\nagainst " "peace, " "against " "hope, " "against " "happiness, " "\nagainst " "all " "discouragement " "that " "could " "be."]
(2) Whitespace-aware splitting: strings.Fields and strings.FieldsFunc
- Fields: splits around runs of whitespace (consecutive whitespace counts as one separator)
- FieldsFunc: splits using a custom rule (a function decides which runes are separators)
// Fields (splits on whitespace, merging consecutive whitespace)
arrayFields := strings.Fields(quote)
// Output: ["I" "loved" "her" "against" "reason," "against" "promise," "against" "peace," "against" "hope," "against" "happiness," "against" "all" "discouragement" "that" "could" "be."]
// FieldsFunc (custom rule: punctuation and other non-letters are separators)
f := func(c rune) bool {
	return unicode.IsPunct(c) || !unicode.IsLetter(c)
}
arrayFunc := strings.FieldsFunc(quote, f)
// Output: ["I" "loved" "her" "against" "reason" "against" "promise" "against" "peace" "against" "hope" "against" "happiness" "against" "all" "discouragement" "that" "could" "be"]
6.8.2 Joining
strings.Join joins a slice of strings into a single string with the given separator (the inverse of Split):
array := []string{"I", "loved", "her"}
merged := strings.Join(array, " ") // "I loved her"
6.9 Trimming Strings (Removing Leading and Trailing Characters)
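6.9.1 General Trimming: Trim, TrimLeft, TrimRight
- Trim: removes all leading and trailing characters that appear in the given cutset
- TrimLeft: removes leading characters only
- TrimRight: removes trailing characters only
var str string = ",and that is all."
var cutset string = ",. " // characters to remove: comma, period, space
// Trim both ends
trimmed := strings.Trim(str, cutset) // "and that is all"
// Trim the start only
trimmedLeft := strings.TrimLeft(str, cutset) // "and that is all."
// Trim the end only
trimmedRight := strings.TrimRight(str, cutset) // ",and that is all"
6.9.2 Trimming a Prefix or Suffix: TrimPrefix and TrimSuffix
These remove a complete prefix or suffix substring (rather than a set of characters):
var str string = ",and that is all."
// Remove the prefix
trimmedPrefix := strings.TrimPrefix(str, ",and ") // "that is all."
// Remove the suffix
trimmedSuffix := strings.TrimSuffix(str, "all.") // ",and that is " (the trailing space remains)
6.9.3 Trimming Whitespace: TrimSpace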
Removes all leading and trailing whitespace (including \r, \n, \t, and spaces):
trimmed := strings.TrimSpace("\r\n\tHello World\t\n\r") // "Hello World"
6.9.4 Custom Trimming: the TrimFunc Family
A function defines the trimming rule. TrimFunc trims both ends, TrimLeftFunc the start, and TrimRightFunc the end:
var str string = ",and that is all."
// Custom rule: trim punctuation and other non-letters
f := func(c rune) bool {
	return unicode.IsPunct(c) || !unicode.IsLetter(c)
}
trimmed := strings.TrimFunc(str, f) // "and that is all"
6.10 Capturing String Input from the Command Line
6.10.1 fmt.Scan (Input Without Spaces)
Suited to reading one or more values that contain no spaces (several values are separated by whitespace):
package main
import "fmt"
func main() {
	// A single value
	var input string
	fmt.Print("Please enter a word: ")
	n, err := fmt.Scan(&input)
	if err != nil {
		fmt.Println("error with user input:", err, n)
	} else {
		fmt.Println("You entered:", input)
	}
	// Several values (separated by spaces)
	var input1, input2 string
	fmt.Println("Please enter two words: ")
	n2, err2 := fmt.Scan(&input1, &input2)
	if err2 != nil {
		fmt.Println("error with user input:", err2, n2)
	} else {
		fmt.Println("You entered:", input1, "and", input2)
	}
}
6.10.2 bufio.Reader (Input Containing Spaces)
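Suited to reading a whole line that may contain spaces (input ends at the newline):
package main
import (
	"bufio"
	"fmt"
	"os"
)
func main() {
	reader := bufio.NewReader(os.Stdin) // wrap standard input
	fmt.Print("Please enter many words: ")
	input, err := reader.ReadString('\n') // reads up to and including the newline
	if err != nil {
		fmt.Println("error with user input:", err)
	} else {
		// input still ends with the newline; strings.TrimSpace can remove it
		fmt.Println("You entered:", input)
	}
}
6.11 Escaping and Unescaping HTML Strings
This relies on the EscapeString and UnescapeString functions of the html package, which handle characters that are special in HTML (such as <, >, and &).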
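As a sketch of where escaping typically fits, the example below wraps untrusted text in a paragraph element. renderComment is a hypothetical helper; note that html.EscapeString also escapes single and double quotes, as numeric entities.

```go
package main

import (
	"fmt"
	"html"
)

// renderComment is a hypothetical helper: it escapes untrusted text before
// placing it inside a <p> element.
func renderComment(text string) string {
	return "<p>" + html.EscapeString(text) + "</p>"
}

func main() {
	out := renderComment(`<script>alert("hi")</script>`)
	fmt.Println(out)
	// <p>&lt;script&gt;alert(&#34;hi&#34;)&lt;/script&gt;</p>
}
```

For whole pages, the standard html/template package applies context-aware escaping automatically and is usually the safer choice.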
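6.11.1 Escaping HTML
Converts special characters into HTML entities:
import "html"
str := "<b>Rock & Roll</b>"
escaped := html.EscapeString(str) // "&lt;b&gt;Rock &amp; Roll&lt;/b&gt;"
6.11.2 Unescaping HTML
Converts HTML entities back into the original characters:
unescaped := html.UnescapeString(escaped) // "<b>Rock & Roll</b>"
6.12 Processing Strings with Regular Expressions
This relies on the regexp package: compile the expression, then call the Find or Replace family of methods. Go regular expressions follow RE2 syntax (no lookahead or lookbehind).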
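6.12.1 Compiling a Regular Expression
- Compile: compiles the expression and returns an error on failure (recommended)
- MustCompile: compiles the expression and panics on failure (the convenience form)
import "regexp"
var quote string = `I loved her against reason, against promise,
against peace, against hope, against happiness,
against all discouragement that could be.`
// Compile a pattern that matches "against", a space, and the word that follows
re, err := regexp.Compile(`against [\w]+`)
// Or use MustCompile (no error to handle)
re = regexp.MustCompile(`against [\w]+`)
6.12.2 Checking for a Match
MatchString reports whether the string contains a match for the pattern:
isMatch := re.MatchString(quote) // true
6.12.3 Finding Matches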
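- FindString: returns the first match
- FindAllString: returns all matches (n = -1 means no limit)
// First match
str := re.FindString(quote) // "against reason"
// All matches
strs := re.FindAllString(quote, -1)
// Output: [against reason against promise against peace against hope against happiness against all]
6.12.4 Finding Match Positions
- FindStringIndex: returns the start and end byte offsets of the first match
- FindAllStringIndex: returns the start and end byte offsets of every match
// Position of the first match
locs := re.FindStringIndex(quote) // [12 26]
// Positions of all matches (these offsets assume each line of the quote ends with a space before the line break)
allLocs := re.FindAllStringIndex(quote, -1)
// Output: [[12 26] [28 43] [46 59] [61 73] [75 92] [95 106]]
6.12.5 Replacing Matches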
- ReplaceAllString: replaces each match with the given string
- ReplaceAllStringFunc: replaces each match with the return value of a function
// Direct replacement
replaced := re.ReplaceAllString(quote, "anything")
// Output: I loved her anything, anything, anything, anything, anything, anything discouragement that could be.
// Replacement by function (convert to upper case)
replaced = re.ReplaceAllStringFunc(quote, strings.ToUpper)
// Output: I loved her AGAINST REASON, AGAINST PROMISE, AGAINST PEACE, AGAINST HOPE, AGAINST HAPPINESS, AGAINST ALL discouragement that could be.
// Replacement by a custom function (upper-case only the second word of each match)
f := func(in string) string {
	split := strings.Split(in, " ")
	split[1] = strings.ToUpper(split[1])
	return strings.Join(split, " ")
}
replaced = re.ReplaceAllStringFunc(quote, f)
// Output: I loved her against REASON, against PROMISE, against PEACE, against HOPE, against HAPPINESS, against ALL discouragement that could be.
6.12.6 Notes
Pitfall: Go's regexp package supports only RE2 syntax. PCRE features such as lookahead and lookbehind are not available, so check syntax compatibility when porting patterns from other languages.
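Where a PCRE pattern would use a lookbehind, a capture group often does the job in Go. The sketch below extracts the word after "against " that way; the names wordAfterAgainst and wordsAfterAgainst are made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// wordAfterAgainst captures the word that follows "against ". A PCRE
// lookbehind such as (?<=against )\w+ is rejected by Go's regexp package,
// so a capture group is used instead (an illustrative workaround).
var wordAfterAgainst = regexp.MustCompile(`against (\w+)`)

// wordsAfterAgainst is a hypothetical helper: it returns every word that
// directly follows "against ".
func wordsAfterAgainst(s string) []string {
	var words []string
	for _, m := range wordAfterAgainst.FindAllStringSubmatch(s, -1) {
		words = append(words, m[1]) // m[0] is the full match, m[1] the group
	}
	return words
}

func main() {
	quote := "I loved her against reason, against promise, against peace."
	fmt.Println(wordsAfterAgainst(quote)) // [reason promise peace]

	// Compiling a lookbehind pattern fails instead of panicking.
	_, err := regexp.Compile(`(?<=against )\w+`)
	fmt.Println(err != nil) // true
}
```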
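[Quick reference: key points]
- A Go string is a read-only slice of bytes. There is no char type; characters are represented by byte (a single byte, enough for ASCII) and rune (a Unicode code point);
- Literals: double quotes (single line, with escapes), backticks (multi-line raw strings), single quotes (a single rune or byte);
- string ↔ byte slice: convert directly with []byte(str) and string(bytes);
- Concatenation: the + operator is fastest in the benchmark shown, the fmt functions are slowest (especially with non-string arguments), and strings.Builder suits building from many pieces;
- Number ↔ string conversion uses the strconv package: the Parse and Format functions (general) and Atoi/Itoa (integer convenience forms); handle NumError when conversion fails;
- Replacement: strings.Replace/ReplaceAll for a single substitution, strings.Replacer for several (more efficient when reused);
- Substring extraction: prefer strings.Index plus length over manual counting (UTF-8 characters may span several bytes);
- Containment: strings.Contains (convenient) or strings.Index (same performance); HasPrefix/HasSuffix for prefixes and suffixes;
- Splitting and joining: Split (explicit separator), Fields (whitespace), Join (slice to string); FieldsFunc supports custom rules;
- Trimming: Trim (custom cutset), TrimSpace (whitespace), TrimFunc (custom rule); TrimPrefix/TrimSuffix remove complete substrings;
- Command-line input: fmt.Scan (no spaces), bufio.Reader (with spaces);
- HTML: html.EscapeString (escape) and html.UnescapeString (unescape);
- Regular expressions: compile with Compile/MustCompile, use the Find and Replace families on matches, and remember that RE2 syntax has no lookahead or lookbehind.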
