Go Cookbook CN Series 06: A Complete Guide to Creating, Converting, and Working with Strings

2026-04-02

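This article covers the core string operations in Go: how strings are created and converted between types, and how to handle common tasks such as concatenation, number conversion, replacement, and substring extraction, together with the underlying properties of Go strings that explain each technique. By the end you should be able to handle most everyday string tasks efficiently.

[Key takeaways]

  • Know the three ways to create strings in Go and understand the difference between byte and rune
  • Convert between string and byte slices in both directions, and know the concatenation options and how their performance differs
  • Convert between strings and numbers and handle conversion errors
  • Use the core functions for replacement, substring extraction, containment checks, splitting and joining, and trimming
  • Capture string input from the command line, escape HTML, and process strings with regular expressions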

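6.0 Groundwork: What a String Is

6.0.1 What a string is under the hood

String manipulation is one of the most common tasks in Go development; both user interaction and machine-to-machine communication depend on it. Go provides several packages for working with strings:

  • strconv: converts between strings and other types
  • fmt: formats strings using verbs (similar to C's printf)
  • unicode, unicode/utf8, unicode/utf16: work with Unicode-encoded text
  • strings: contains most of the commonly used string functions

In Go, a string is a read-only slice of bytes that can hold arbitrary bytes (it does not have to follow any particular encoding), which differs from the "sequence of characters" definition used in some other languages. Go has no dedicated char type; characters are represented by two types:

  • byte: an alias for uint8; holds one byte, which is enough for an ASCII character
  • rune: an alias for int32; holds one Unicode code point, which takes 1 to 4 bytes when encoded as UTF-8

Note that indexing a string yields a byte, not a character, and a single Unicode character may span several bytes (and sometimes several code points).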
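The byte/rune distinction can be illustrated with a minimal sketch. The sample string "Go语言" and the helper name byteAndRuneCount are illustrative assumptions, not part of the original text.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCount returns the length of s in bytes and in runes (code points).
// Hypothetical helper for illustration only.
func byteAndRuneCount(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "Go语言" // 2 ASCII letters (1 byte each) + 2 CJK characters (3 bytes each)

	nBytes, nRunes := byteAndRuneCount(s)
	fmt.Println(nBytes, nRunes) // 8 4

	// Indexing yields a single byte, not a character.
	fmt.Println(s[2]) // 232, the first byte of "语"

	// range decodes UTF-8 and yields each rune with its starting byte offset.
	for i, r := range s {
		fmt.Printf("%d:%c ", i, r)
	}
	fmt.Println()
}
```

Iterating with range is usually the simplest way to walk a string character by character, because it never lands in the middle of a multi-byte sequence.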

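6.1 Creating Strings

6.1.1 Single-line strings (double quotes)

Use double quotes "" to create a single-line string; escape sequences are supported (for example \n for a newline and \" for a literal double quote):

// Basic single-line string
var str = "A simple string"
// With a newline escape
var strWithNewline = "A simple string\n"
// With escaped double quotes
var strWithQuote = "A \"simple\" string"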

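6.1.2 Multi-line raw strings (backticks)

Use backticks `` to create a raw string: escape sequences are not interpreted, and the literal may span multiple lines (a double-quoted literal that spans lines is a syntax error):

// Wrong: a double-quoted literal cannot span multiple lines
// var str = "
// A simple string
// "

// Right: a backtick-quoted raw string may span multiple lines
var str = `
A simple string
`

The compiler keeps the content between the backticks verbatim, with one exception: carriage return characters (\r) are dropped.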
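A short sketch of the difference between interpreted and raw literals; the helper name literalLengths is an assumption made for illustration.

```go
package main

import "fmt"

// literalLengths returns the byte lengths of an interpreted literal and a raw
// literal that look identical in source code. Hypothetical helper.
func literalLengths() (int, int) {
	interpreted := "a\nb" // \n becomes a single newline byte
	raw := `a\nb`         // the backslash and the 'n' are kept as two bytes
	return len(interpreted), len(raw)
}

func main() {
	i, r := literalLengths()
	fmt.Println(i, r) // 3 4
}
```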

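6.1.3 Single characters (single quotes)

Single quotes create a single character. The default type is rune (int32); you can declare the variable as byte explicitly:

// Default type is rune (int32)
var c = 'A'
// Explicitly typed as byte
var cByte byte = 'A'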

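6.2 Converting Between String and Byte Slices

Because a Go string is essentially a read-only byte slice, a plain type conversion works in both directions with no helper function. Each conversion copies the data, so the result can be modified without affecting the original.

6.2.1 String to byte slice

str := "This is a simple string"  // define a string
bytes := []byte(str)              // convert it to a byte slice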

6.2.2 Byte slice to string

// A hand-written byte slice (spells "This is a simple string")
bytes := []byte{84, 104, 105, 115, 32, 105, 115, 32, 97, 32, 115, 105, 109, 112, 108, 101, 32, 115, 116, 114, 105, 110, 103}
str := string(bytes)  // convert to a string
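Since strings are read-only, changing a character means going through a byte slice. The sketch below shows that the original string is left untouched; replaceFirstByte is a hypothetical helper, not something from the original text.

```go
package main

import "fmt"

// replaceFirstByte returns a copy of s whose first byte is c.
// Hypothetical helper; it assumes the first character of s is a single byte.
func replaceFirstByte(s string, c byte) string {
	if len(s) == 0 {
		return s
	}
	b := []byte(s)   // copies the string's bytes into a new, mutable slice
	b[0] = c         // modifies only the copy
	return string(b) // copies again into a new, immutable string
}

func main() {
	orig := "This is a simple string"
	changed := replaceFirstByte(orig, 'X')
	fmt.Println(orig)    // This is a simple string
	fmt.Println(changed) // Xhis is a simple string
}
```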

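6.3 Concatenating and Building Strings (from other data or strings)

There are several ways to build a new string from existing strings or data, and their performance differs noticeably.

6.3.1 Direct concatenation (the + operator)

The simplest approach, and the fastest in the benchmark in 6.3.5:

var str string = "The time is " + time.Now().Format(time.Kitchen) + " now."

Sample result: The time is 5:28PM now.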

6.3.2 strings.Join

Takes a slice of strings and a separator, and joins the elements into one string:

var str string = strings.Join([]string{"The time is", time.Now().Format(time.Kitchen), "now."}, " ")

6.3.3 fmt.Sprint/Sprintf

  • fmt.Sprint: accepts arguments of any type and concatenates them into a string
  • fmt.Sprintf: builds the string from a format string with verbs (such as %v), which is more flexible

// fmt.Sprint
var str1 string = fmt.Sprint("The time is ", time.Now().Format(time.Kitchen), " now.")
// fmt.Sprintf (format verbs)
var str2 string = fmt.Sprintf("The time is %v now.", time.Now())

6.3.4 strings.Builder

Suited to building large strings piece by piece: write the parts, then extract the result:

// Option 1: write strings with WriteString
var builder1 strings.Builder
builder1.WriteString("The time is ")
builder1.WriteString(time.Now().Format(time.Kitchen))
builder1.WriteString(" now.")
var str3 string = builder1.String()

// Option 2: write values of any type with fmt.Fprint
var builder2 strings.Builder
fmt.Fprint(&builder2, "The time is ")
fmt.Fprint(&builder2, time.Now())
fmt.Fprint(&builder2, " now.")
var str4 string = builder2.String()

strings.Builder also provides Write (byte slice), WriteByte (single byte), and WriteRune (single rune).
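strings.Builder tends to pay off when the number of pieces is not known in advance, for example in a loop. The following is a minimal sketch under that assumption; joinWithBuilder is a hypothetical helper, and the Grow call is an optional pre-sizing step.

```go
package main

import (
	"fmt"
	"strings"
)

// joinWithBuilder concatenates parts, separated by sep, using strings.Builder.
// Hypothetical helper for illustration; strings.Join does the same job.
func joinWithBuilder(parts []string, sep string) string {
	var b strings.Builder

	// Optional: reserve capacity up front to avoid repeated reallocation.
	n := 0
	for _, p := range parts {
		n += len(p) + len(sep)
	}
	b.Grow(n)

	for i, p := range parts {
		if i > 0 {
			b.WriteString(sep)
		}
		b.WriteString(p)
	}
	return b.String()
}

func main() {
	s := joinWithBuilder([]string{"The time is", "5:28PM", "now."}, " ")
	fmt.Println(s) // The time is 5:28PM now.
}
```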

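6.3.5 Performance comparison

The benchmark code and results follow. In this run, direct concatenation (+) is the fastest, and the fmt functions are the slowest, especially when they have to format a non-string argument such as time.Time.

Benchmark code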

package string

import (
    "fmt"
    "strings"
    "testing"
    "time"
)

func BenchmarkStringConcat(b *testing.B) {
    for i := 0; i < b.N; i++ {
        _ = "The time is " + time.Now().Format(time.Kitchen) + " now."
    }
}

func BenchmarkStringJoin(b *testing.B) {
    for i := 0; i < b.N; i++ {
        _ = strings.Join([]string{"The time is", time.Now().Format(time.Kitchen), "now."}, "")
    }
}

func BenchmarkStringSprint(b *testing.B) {
    for i := 0; i < b.N; i++ {
        _ = fmt.Sprint("The time is ", time.Now().Format(time.Kitchen), "now.")
    }
}

func BenchmarkStringSprintDiff(b *testing.B) {
    for i := 0; i < b.N; i++ {
        _ = fmt.Sprint("The time is ", time.Now(), " now.")
    }
}

func BenchmarkStringSprintf(b *testing.B) {
    for i := 0; i < b.N; i++ {
        _ = fmt.Sprintf("The time is %v now.", time.Now().Format(time.Kitchen))
    }
}

func BenchmarkStringSprintfDiff(b *testing.B) {
    for i := 0; i < b.N; i++ {
        _ = fmt.Sprintf("The time is %s now.", time.Now())
    }
}

func BenchmarkStringBuilderFprint(b *testing.B) {
    for i := 0; i < b.N; i++ {
        var builder strings.Builder
        fmt.Fprint(&builder, "The time is ")
        fmt.Fprint(&builder, time.Now())
        fmt.Fprint(&builder, " now.")
        _ = builder.String()
    }
}

func BenchmarkStringBuilderWriteString(b *testing.B) {
    for i := 0; i < b.N; i++ {
        var builder strings.Builder
        builder.WriteString("The time is ")
        builder.WriteString(time.Now().Format(time.Kitchen))
        builder.WriteString(" now.")
        _ = builder.String()
    }
}

Test command

go test -bench=BenchmarkString

Test results

goos: darwin
goarch: arm64
pkg: github.com/sausheong/gocookbook/ch06_string
BenchmarkStringConcat-10                5787976               206.7 ns/op
BenchmarkStringJoin-10                  5121637               235.0 ns/op
BenchmarkStringSprint-10                3680838               323.8 ns/op
BenchmarkStringSprintDiff-10            1541514               779.9 ns/op
BenchmarkStringSprintf-10               4032438               297.8 ns/op
BenchmarkStringSprintfDiff-10           1610212               740.9 ns/op
BenchmarkStringBuilderFprint-10         2580783               464.2 ns/op
BenchmarkStringBuilderWriteString-10    4866556               247.0 ns/op
PASS
ok      github.com/sausheong/gocookbook/ch06_string 13.025s

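6.4 Converting Between Strings and Numbers

These conversions use the strconv package, whose functions fall into two groups:

  • Parse functions: string to number (they read a string)
  • Format functions: number to string (they produce a string)

6.4.1 String to number

(1) To an integer

  • Atoi: convenience function, equivalent to ParseInt(s, 10, 0) with the result converted to int
  • ParseInt: general form; takes a base (0, or 2 to 36) and a bitSize (0 to 64)

// Atoi (string to int)
i, err := strconv.Atoi("123")
// ParseInt (signed; takes a base and a bit size; use ParseUint for unsigned values)
i2, err := strconv.ParseInt("123", 10, 0)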

(2) To a floating-point number

ParseFloat: takes a bitSize (32 for float32, 64 for float64); the return type is always float64:

f, err := strconv.ParseFloat("1.234", 64)

(3) To a boolean

ParseBool: accepts 1, t, T, TRUE, true, True (true) and 0, f, F, FALSE, false, False (false):

b, err := strconv.ParseBool("TRUE")

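(4) Error handling

When parsing fails, every Parse function (and Atoi) returns an error of type *strconv.NumError, from which you can extract details:

str := "Not a number"
_, err := strconv.Atoi(str)
if err != nil {
    e := err.(*strconv.NumError) // the concrete error type returned by strconv
    fmt.Println("Func:", e.Func)
    fmt.Println("Num:", e.Num)
    fmt.Println("Err:", e.Err)
    fmt.Println(err)
}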

Output:

Func: Atoi
Num: Not a number
Err: invalid syntax
strconv.Atoi: parsing "Not a number": invalid syntax
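Beyond printing the fields, the wrapped error can be compared against strconv.ErrSyntax and strconv.ErrRange to tell bad input from out-of-range input. A minimal sketch, in which classify and its return labels are illustrative assumptions:

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// classify parses s as an 8-bit signed integer and reports what went wrong.
// Hypothetical helper; the labels "ok", "syntax", "range" are made up here.
func classify(s string) string {
	_, err := strconv.ParseInt(s, 10, 8)
	if err == nil {
		return "ok"
	}
	var numErr *strconv.NumError
	if errors.As(err, &numErr) { // safer than a bare type assertion
		switch {
		case errors.Is(numErr.Err, strconv.ErrSyntax):
			return "syntax"
		case errors.Is(numErr.Err, strconv.ErrRange):
			return "range"
		}
	}
	return "other"
}

func main() {
	fmt.Println(classify("42"))   // ok
	fmt.Println(classify("abc"))  // syntax
	fmt.Println(classify("1000")) // range: 1000 does not fit in 8 bits
}
```

Using errors.As instead of err.(*strconv.NumError) avoids a panic if the error ever turns out to be of a different type.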

6.4.2 Number to string

(1) Integer to string

  • Itoa: convenience function, equivalent to FormatInt(int64(n), 10)
  • FormatInt: general form; takes a base (2 to 36)

// Itoa (int to string)
str := strconv.Itoa(123)
// FormatInt (with a base, here binary)
str2 := strconv.FormatInt(int64(123), 2) // result: "1111011"
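FormatInt and ParseInt mirror each other, so a value written in some base can be read back with the same base. A small sketch; toBase and fromBase are hypothetical wrappers added only for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// toBase formats n in the given base (2 to 36). Hypothetical wrapper.
func toBase(n int64, base int) string {
	return strconv.FormatInt(n, base)
}

// fromBase parses s as a signed 64-bit integer in the given base. Hypothetical wrapper.
func fromBase(s string, base int) (int64, error) {
	return strconv.ParseInt(s, base, 64)
}

func main() {
	hex := toBase(255, 16)
	fmt.Println(hex) // ff

	n, err := fromBase(hex, 16)
	fmt.Println(n, err) // 255 <nil>
}
```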

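(2) Floating-point number to string

FormatFloat: takes a format, a precision, and a bitSize. The format options are:

  • f: no exponent
  • e/E: with exponent
  • g/G: exponent form for large exponents, otherwise no exponent
  • b/x/X: binary exponent / hexadecimal (special cases)

The precision argument prec: -1 selects the smallest number of digits needed for ParseFloat to read back the same value.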

var v float64 = 123456.123456
var s string

// f format (no exponent)
s = strconv.FormatFloat(v, 'f', -1, 64)  // 123456.123456
s = strconv.FormatFloat(v, 'f', 4, 64)   // 123456.1235
s = strconv.FormatFloat(v, 'f', 9, 64)   // 123456.123456000

// e/E format (scientific notation)
s = strconv.FormatFloat(v, 'e', -1, 64)  // 1.23456123456e+05
s = strconv.FormatFloat(v, 'E', -1, 64)  // 1.23456123456E+05

// g/G format (adaptive)
s = strconv.FormatFloat(v, 'g', -1, 64)  // 123456.123456
s = strconv.FormatFloat(v, 'g', 4, 64)   // 1.235e+05

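6.5 Replacing Text in Strings

6.5.1 strings.Replace/ReplaceAll

  • Replace: takes the number of replacements to make (-1 means replace all)
  • ReplaceAll: convenience function, equivalent to Replace(..., -1)

Example (using a quotation from Great Expectations):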

var quote string = `I loved her against reason, against promise, against peace, against hope, against happiness, against all discouragement that could be.`

// Replace the first "against"
replaced := strings.Replace(quote, "against", "with", 1)
// Replace the first two occurrences of "against"
replaced2 := strings.Replace(quote, "against", "with", 2)
// Replace every "against" (equivalent to ReplaceAll)
replacedAll := strings.Replace(quote, "against", "with", -1)

6.5.2 strings.Replacer (multiple replacements)

Suited to replacing several strings at once; once created, a replacer can be reused:

// Create a replacer: her→him, against→for, all→some
replacer := strings.NewReplacer("her", "him", "against", "for", "all", "some")
replaced := replacer.Replace(quote)
// Output: I loved him for reason, for promise, for peace, for hope, for happiness, for some discouragement that could be.
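Besides reuse, a Replacer makes all its substitutions in a single pass over the input, so text it has already replaced is not replaced again. Chained Replace calls do not have that property. The sketch below swaps two words to show the difference; the function names and sample words are illustrative assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// swapChained tries to swap "cat" and "dog" with two chained calls.
// The second call also rewrites the output of the first.
func swapChained(s string) string {
	s = strings.ReplaceAll(s, "cat", "dog")
	return strings.ReplaceAll(s, "dog", "cat")
}

// swapReplacer swaps "cat" and "dog" in one pass over the input.
func swapReplacer(s string) string {
	return strings.NewReplacer("cat", "dog", "dog", "cat").Replace(s)
}

func main() {
	s := "cat chases dog"
	fmt.Println(swapChained(s))  // cat chases cat
	fmt.Println(swapReplacer(s)) // dog chases cat
}
```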

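6.5.3 Performance comparison

  • Single replacement: strings.Replace is faster than strings.Replacer
  • Multiple replacements: strings.Replacer is more efficient

Benchmark results for reference: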

goos: darwin
goarch: arm64
pkg: github.com/sausheong/gocookbook/ch06_string
BenchmarkOneReplace-10        7264310               156.9 ns/op
BenchmarkOneReplacer-10       4336489               276.0 ns/op
BenchmarkReplace-10           2250291               532.1 ns/op
BenchmarkReplacerCreate-10   31878366               37.13 ns/op
BenchmarkReplacer-10          4671319               255.0 ns/op
PASS
ok      github.com/sausheong/gocookbook/ch06_string 4.547s

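6.6 Extracting Substrings

6.6.1 Extracting by slicing

Because a string behaves like a byte slice, you can extract a substring with a slice expression: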

var quote string = `I loved her against reason, against promise,
against peace, against hope, against happiness,
against all discouragement that could be.`

// Extract "against reason" (byte indexes 12 to 26)
subStr := quote[12:26]

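6.6.2 Locating with strings.Index (no manual counting)

Use strings.Index to get the starting index of the substring, then add its length to get the end index:

// Find the starting index of the substring (Index returns -1 if it is absent, so check before slicing)
i := strings.Index(quote, "against reason")
// Compute the end index (start + length of the substring)
j := i + len("against reason")
// Extract the substring
subStr := quote[i:j]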

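6.6.3 Caveats

Pitfall: do not count characters by hand! In UTF-8, some characters (such as emoji and special symbols) occupy several bytes, so locate the substring with strings.Index plus its length to avoid cutting a character in half.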
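When the goal is "the first n characters" rather than a known substring, iterating with range gives byte offsets that always fall on character boundaries. A minimal sketch; firstNRunes is a hypothetical helper, and it counts code points, not user-perceived characters.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstNRunes returns the first n runes of s without splitting a multi-byte
// character. Hypothetical helper for illustration.
func firstNRunes(s string, n int) string {
	if n <= 0 {
		return ""
	}
	count := 0
	for i := range s { // i is the byte offset where each rune starts
		if count == n {
			return s[:i]
		}
		count++
	}
	return s // s has n runes or fewer
}

func main() {
	s := "héllo" // 'é' takes two bytes in UTF-8

	fmt.Println(utf8.ValidString(s[:2])) // false: the slice ends inside 'é'
	fmt.Println(firstNRunes(s, 2))       // hé
}
```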

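6.7 Checking What a String Contains

6.7.1 Contains a substring

  • strings.Contains: returns a boolean directly (convenience function)
  • strings.Index: returns the starting index of the substring (>= 0 means present, -1 means absent)

The two perform the same; Contains is a thin wrapper that checks the result of Index:
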
var quote string = `I loved her against reason, against promise,
against peace, against hope, against happiness,
against all discouragement that could be.`

// Check with Contains
hasAgainst := strings.Contains(quote, "against") // true
// Check with Index
idx := strings.Index(quote, "against") // 12 (>= 0 means present)

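6.7.2 Prefix/suffix checks

  • HasPrefix: reports whether the string starts with the given prefix
  • HasSuffix: reports whether the string ends with the given suffix

// Prefix check
hasPrefix := strings.HasPrefix(quote, "I loved") // true
// Suffix check
hasSuffix := strings.HasSuffix(quote, "could be.") // true

You can also compare slices by hand. The result is the same as long as you check the length first, because slicing past the end of a shorter string panics:

// Manual prefix check
prefix := "I loved"
if len(quote) >= len(prefix) && quote[:len(prefix)] == prefix {
    // handle a prefix match
}

// Manual suffix check
suffix := "could be."
if len(quote) >= len(suffix) && quote[len(quote)-len(suffix):] == suffix {
    // handle a suffix match
}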

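6.8 Splitting and Joining Strings

6.8.1 Splitting

(1) strings.Split/SplitN/SplitAfter

  • Split: splits on the given separator into all elements
  • SplitN: limits the number of elements returned (the remainder becomes the last element)
  • SplitAfter: splits but keeps the separator at the end of each element
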
var quote string = `I loved her against reason, against promise,
against peace, against hope, against happiness,
against all discouragement that could be.`

// Split (on a single space). The outputs below assume the first two lines of quote end with a space before the line break.
array := strings.Split(quote, " ")
// Output: ["I" "loved" "her" "against" "reason," "against" "promise," "\nagainst" "peace," "against" "hope," "against" "happiness," "\nagainst" "all" "discouragement" "that" "could" "be."]

// SplitN (at most 10 elements)
arrayN := strings.SplitN(quote, " ", 10)
// Output: ["I" "loved" "her" "against" "reason," "against" "promise," "\nagainst" "peace," "against hope, against happiness, \nagainst all discouragement that could be."]

// SplitAfter (keeps the separator)
arrayAfter := strings.SplitAfter(quote, " ")
// Output: ["I " "loved " "her " "against " "reason, " "against " "promise, " "\nagainst " "peace, " "against " "hope, " "against " "happiness, " "\nagainst " "all " "discouragement " "that " "could " "be."]

(2) Smarter splitting: strings.Fields/FieldsFunc

  • Fields: splits on any whitespace (a run of consecutive whitespace counts as one separator)
  • FieldsFunc: splits by a custom rule (a function decides which runes are separators)

// Fields (split on whitespace; consecutive whitespace is collapsed)
arrayFields := strings.Fields(quote)
// Output: ["I" "loved" "her" "against" "reason," "against" "promise," "against" "peace," "against" "hope," "against" "happiness," "against" "all" "discouragement" "that" "could" "be."]

// FieldsFunc (custom rule: punctuation and any other non-letter are separators)
f := func(c rune) bool {
    return unicode.IsPunct(c) || !unicode.IsLetter(c)
}
arrayFunc := strings.FieldsFunc(quote, f)
// Output: ["I" "loved" "her" "against" "reason" "against" "promise" "against" "peace" "against" "hope" "against" "happiness" "against" "all" "discouragement" "that" "could" "be"]

6.8.2 Joining

strings.Join: joins a slice of strings into one string with the given separator (the inverse of Split):

array := []string{"I", "loved", "her"}
merged := strings.Join(array, " ") // "I loved her"
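Fields and Join combine naturally into a whitespace normalizer: split on any run of whitespace, then join with single spaces. A minimal sketch, where normalizeSpaces is a hypothetical helper name:

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeSpaces collapses every run of whitespace (spaces, tabs, newlines)
// into a single space and drops leading and trailing whitespace.
// Hypothetical helper for illustration.
func normalizeSpaces(s string) string {
	return strings.Join(strings.Fields(s), " ")
}

func main() {
	s := "  I loved her \nagainst   reason,\tagainst promise "
	fmt.Println(normalizeSpaces(s)) // I loved her against reason, against promise
}
```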

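6.9 Trimming Strings (removing leading and trailing characters)

6.9.1 General trimming: Trim/TrimLeft/TrimRight

  • Trim: removes all leading and trailing characters that appear in the cutset
  • TrimLeft: removes leading characters only
  • TrimRight: removes trailing characters only

var str string = ",and that is all."
var cutset string = ",. " // characters to remove: comma, period, space

// Trim both ends
trimmed := strings.Trim(str, cutset) // "and that is all"
// Trim the start only
trimmedLeft := strings.TrimLeft(str, cutset) // "and that is all."
// Trim the end only
trimmedRight := strings.TrimRight(str, cutset) // ",and that is all"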

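6.9.2 Prefix/suffix trimming: TrimPrefix/TrimSuffix

Removes a whole prefix or suffix substring (rather than a set of characters):

var str string = ",and that is all."

// Remove the prefix
trimmedPrefix := strings.TrimPrefix(str, ",and ") // "that is all."
// Remove the suffix
trimmedSuffix := strings.TrimSuffix(str, "all.") // ",and that is " (the trailing space remains)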
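Mixing up the cutset functions with the substring functions is a common source of bugs: TrimRight treats its second argument as a set of characters, not as a suffix. The sketch below uses a made-up file name to show the difference; both helper names are illustrative assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// stripWithCutset removes any trailing '.', 'g', or 'o' characters.
func stripWithCutset(name string) string {
	return strings.TrimRight(name, ".go")
}

// stripSuffix removes the literal suffix ".go" once, if present.
func stripSuffix(name string) string {
	return strings.TrimSuffix(name, ".go")
}

func main() {
	name := "logo.go" // hypothetical file name
	fmt.Println(stripWithCutset(name)) // l
	fmt.Println(stripSuffix(name))     // logo
}
```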

6.9.3 Whitespace trimming: TrimSpace

Removes all leading and trailing whitespace (including \r, \n, \t, and spaces):

trimmed := strings.TrimSpace("\r\n\tHello World\t\n\r") // "Hello World"

6.9.4 Custom trimming: the TrimFunc family

Define the trimming rule with a function; TrimFunc (both ends), TrimLeftFunc (leading), and TrimRightFunc (trailing) are available:

var str string = ",and that is all."
// Custom rule: trim punctuation and any other non-letter
f := func(c rune) bool {
    return unicode.IsPunct(c) || !unicode.IsLetter(c)
}
trimmed := strings.TrimFunc(str, f) // "and that is all"

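6.10 Capturing String Input from the Command Line

6.10.1 fmt.Scan (input without spaces)

Suited to reading one or more values that contain no spaces (multiple values are separated by spaces):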

package main
import "fmt"

func main() {
    // Single input
    var input string
    fmt.Print("Please enter a word: ")
    n, err := fmt.Scan(&input)
    if err != nil {
        fmt.Println("error with user input:", err, n)
    } else {
        fmt.Println("You entered:", input)
    }

    // Multiple inputs (space-separated)
    var input1, input2 string
    fmt.Println("Please enter two words: ")
    n2, err2 := fmt.Scan(&input1, &input2)
    if err2 != nil {
        fmt.Println("error with user input:", err2, n2)
    } else {
        fmt.Println("You entered:", input1, "and", input2)
    }
}

6.10.2 bufio.Reader (input with spaces)

Suited to reading a whole line that may contain spaces (input ends at the newline):

package main
import (
    "bufio"
    "os"
    "fmt"
)

func main() {
    reader := bufio.NewReader(os.Stdin) // wrap standard input
    fmt.Print("Please enter many words: ")
    input, err := reader.ReadString('\n') // read up to and including the newline
    if err != nil {
        fmt.Println("error with user input:", err)
    } else {
        fmt.Println("You entered:", input)
    }
}
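ReadString returns the line with its trailing newline still attached, and on the last line of input it may return data together with io.EOF. The sketch below handles both cases; readLine and readFirstLine are hypothetical helpers, and strings.NewReader stands in for os.Stdin so the example runs without a terminal.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readLine reads one line from r and strips the trailing "\n" or "\r\n".
// A final line without a newline is still returned; io.EOF is reported only
// when nothing at all was read.
func readLine(r *bufio.Reader) (string, error) {
	line, err := r.ReadString('\n')
	if err != nil && (err != io.EOF || line == "") {
		return "", err
	}
	return strings.TrimRight(line, "\r\n"), nil
}

// readFirstLine feeds a fixed string through readLine in place of os.Stdin.
func readFirstLine(input string) (string, error) {
	return readLine(bufio.NewReader(strings.NewReader(input)))
}

func main() {
	line, err := readFirstLine("many words here\r\nnext line")
	fmt.Printf("%q %v\n", line, err) // "many words here" <nil>
}
```

In a real program, bufio.NewReader(os.Stdin) would be passed to readLine instead.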

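6.11 Escaping and Unescaping HTML Strings

These use the EscapeString and UnescapeString functions of the html package to handle HTML special characters (such as <, >, and &).

6.11.1 Escaping HTML

Converts special characters to HTML entities: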

import "html"

str := "<b>Rock & Roll</b>"
escaped := html.EscapeString(str) // "&lt;b&gt;Rock &amp; Roll&lt;/b&gt;"

6.11.2 Unescaping HTML

Converts HTML entities back to the original characters:

unescaped := html.UnescapeString(escaped) // "<b>Rock & Roll</b>"
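EscapeString also escapes single and double quotes, using numeric entities. A minimal round-trip sketch; roundTrip and the sample markup are illustrative assumptions.

```go
package main

import (
	"fmt"
	"html"
)

// roundTrip escapes s and then unescapes the result. Hypothetical helper.
func roundTrip(s string) (escaped, restored string) {
	escaped = html.EscapeString(s)
	restored = html.UnescapeString(escaped)
	return escaped, restored
}

func main() {
	e, r := roundTrip(`<a href="x">Tom & Jerry's</a>`)
	fmt.Println(e) // &lt;a href=&#34;x&#34;&gt;Tom &amp; Jerry&#39;s&lt;/a&gt;
	fmt.Println(r) // <a href="x">Tom & Jerry's</a>
}
```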

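6.12 Processing Strings with Regular Expressions

This uses the regexp package: compile the regular expression, then call the Find/Replace family of methods. Go's regular expressions follow RE2 syntax (no lookahead or lookbehind).

6.12.1 Compiling a regular expression

  • Compile: compiles the expression and returns an error on failure (recommended)
  • MustCompile: compiles the expression and panics on failure (convenience function)

import "regexp"

var quote string = `I loved her against reason, against promise,
against peace, against hope, against happiness,
against all discouragement that could be.`

// Compile: matches "against" followed by a space and one word
re, err := regexp.Compile(`against \w+`)
// Or use MustCompile (no error to handle; panics on an invalid pattern)
re = regexp.MustCompile(`against \w+`)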

6.12.2 Checking for a match

MatchString: reports whether the string matches the expression:

isMatch := re.MatchString(quote) // true

6.12.3 Finding matches

  • FindString: returns the first match
  • FindAllString: returns all matches (n = -1 means all)

// First match
str := re.FindString(quote) // "against reason"
// All matches
strs := re.FindAllString(quote, -1)
// Output: [against reason against promise against peace against hope against happiness against all]

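6.12.4 Finding match positions

  • FindStringIndex: returns the start and end index of the first match
  • FindAllStringIndex: returns the start and end index of every match

// Position of the first match
locs := re.FindStringIndex(quote) // [12 26]
// Positions of all matches
allLocs := re.FindAllStringIndex(quote, -1)
// Output: [[12 26] [28 43] [46 59] [61 73] [75 92] [95 106]] (assumes the first two lines of quote end with a space before the line break)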

6.12.5 Replacing matches

  • ReplaceAllString: replaces each match with the given string
  • ReplaceAllStringFunc: replaces each match with the return value of a function

// Direct replacement
replaced := re.ReplaceAllString(quote, "anything")
// Output: I loved her anything, anything, anything, anything, anything, anything discouragement that could be.

// Replacement by function (convert to upper case)
replaced = re.ReplaceAllStringFunc(quote, strings.ToUpper)
// Output: I loved her AGAINST REASON, AGAINST PROMISE, AGAINST PEACE, AGAINST HOPE, AGAINST HAPPINESS, AGAINST ALL discouragement that could be.

// Replacement by custom function (upper-case only the second word of each match)
f := func(in string) string {
    split := strings.Split(in, " ")
    split[1] = strings.ToUpper(split[1])
    return strings.Join(split, " ")
}
replaced = re.ReplaceAllStringFunc(quote, f)
// Output: I loved her against REASON, against PROMISE, against PEACE, against HOPE, against HAPPINESS, against ALL discouragement that could be.
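Splitting the match by hand inside a replacement function works, but capture groups express the same idea more directly. The sketch below is an alternative formulation, not the approach used above: the pattern with a group, the variable name againstRe, and the helpers objects and rewrite are all illustrative assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

// againstRe captures the word that follows "against " in group 1.
var againstRe = regexp.MustCompile(`against (\w+)`)

// objects returns the captured word of every match in s.
func objects(s string) []string {
	var out []string
	for _, m := range againstRe.FindAllStringSubmatch(s, -1) {
		out = append(out, m[1]) // m[0] is the whole match, m[1] the first group
	}
	return out
}

// rewrite replaces "against X" with "for X" using the ${1} group reference.
func rewrite(s string) string {
	return againstRe.ReplaceAllString(s, "for ${1}")
}

func main() {
	s := "against reason, against promise"
	fmt.Println(objects(s)) // [reason promise]
	fmt.Println(rewrite(s)) // for reason, for promise
}
```

Writing the reference as ${1} rather than $1 avoids ambiguity when the reference is followed directly by letters or digits.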

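6.12.6 Caveats

Pitfall: Go's regexp package supports only RE2 syntax and does not support PCRE features such as lookahead and lookbehind, so check syntax compatibility when writing patterns.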

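[Quick reference: key points]

  1. A Go string is a read-only byte slice; there is no char type, and characters are represented by byte (one byte, enough for ASCII) and rune (one Unicode code point);
  2. Creating strings: double quotes (single line, with escapes), backticks (multi-line, raw), single quotes (a single rune/byte);
  3. string ↔ byte slice: convert directly with []byte(str) / string(bytes); each conversion copies the bytes;
  4. Concatenation: the + operator was fastest in the benchmark and the fmt functions slowest (especially with non-string arguments); strings.Builder suits building large strings piece by piece;
  5. Number ↔ string conversion uses strconv: the Parse/Format functions (general) and Atoi/Itoa (integer convenience functions); handle *NumError when a conversion fails;
  6. Replacement: use strings.Replace/ReplaceAll for a single replacement and strings.Replacer for multiple replacements (more efficient when reused);
  7. Substring extraction: prefer strings.Index plus length over manual counting (UTF-8 characters may be multi-byte);
  8. Containment checks: strings.Contains (convenient) / strings.Index (same performance); use HasPrefix/HasSuffix for prefixes and suffixes;
  9. Splitting and joining: Split (given separator), Fields (whitespace), Join (slice to string); FieldsFunc supports custom split rules;
  10. Trimming: Trim (custom cutset), TrimSpace (whitespace), TrimFunc (custom rule); TrimPrefix/TrimSuffix remove a whole substring;
  11. Command-line input: fmt.Scan (no spaces), bufio.Reader (with spaces);
  12. HTML escaping: html.EscapeString (escape) / html.UnescapeString (unescape);
  13. Regular expressions: compile with Compile/MustCompile, process matches with the Find/Replace methods, and remember that RE2 syntax has no lookahead or lookbehind.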