suffixarray

package suffixarray

import "index/suffixarray"

Package suffixarray 使用内存中的后缀数组实现对数时间复杂度的子字符串搜索。

示例用法：

// 为一些数据创建索引
index := suffixarray.New(data)

// 查找字节切片 s
offsets1 := index.Lookup(s, -1) // s 在 data 中出现的所有索引列表
offsets2 := index.Lookup(s, 3)  // s 在 data 中最多出现的 3 个索引列表

Index

type Index

Examples

Index.Lookup

Types

type Index

type Index struct {
	// contains filtered or unexported fields
}

Index 实现了一个用于快速子字符串搜索的后缀数组。

func New

func New(data []byte) *Index

New 为 data 创建一个新的 Index。 Index 的创建时间为 O(N)，其中 N = len(data)。

func (*Index) Bytes

func (x *Index) Bytes() []byte

Bytes 返回创建索引所基于的数据。它不能被修改。

func (*Index) FindAllIndex

func (x *Index) FindAllIndex(r *regexp.Regexp, n int) (result [][]int)

FindAllIndex 返回正则表达式 r 的非重叠匹配的有序列表，其中匹配是一对索引，指定匹配到的 x.Bytes() 的切片。如果 n < 0，则按顺序返回所有匹配。否则，最多返回 n 个匹配，它们可能不是连续的。如果没有匹配或 n == 0，结果为 nil。

func (*Index) Lookup

func (x *Index) Lookup(s []byte, n int) (result []int)

Lookup 返回字节字符串 s 在索引数据中出现最多 n 个索引的无序列表。如果 n < 0，返回所有出现的位置。如果 s 为空、未找到或 n == 0，结果为 nil。查找时间为 O(log(N)*len(s) + len(result))，其中 N 是索引数据的大小。

Example

package main

import (
	"fmt"
	"index/suffixarray"
)

func main() {
	index := suffixarray.New([]byte("banana"))
	offsets := index.Lookup([]byte("ana"), -1)
	for _, off := range offsets {
		fmt.Println(off)
	}

}

Output:

1
3

func (*Index) Read

func (x *Index) Read(r io.Reader) error

Read 将索引从 r 读入 x；x 不能为 nil。

func (*Index) Write

func (x *Index) Write(w io.Writer) error

Write 将索引 x 写入 w。