Crate jetscii
A tiny library to efficiently search strings for sets of ASCII characters or byte slices for sets of bytes.
Examples
Searching for a set of ASCII characters
    #[macro_use]
    extern crate jetscii;

    fn main() {
        let part_number = "86-J52:rev1";
        let first = ascii_chars!('-', ':').find(part_number);
        assert_eq!(first, Some(2));
    }
Searching for a set of bytes
    #[macro_use]
    extern crate jetscii;

    fn main() {
        let raw_data = [0x00, 0x01, 0x10, 0xFF, 0x42];
        let first = bytes!(0x01, 0x10).find(&raw_data);
        assert_eq!(first, Some(1));
    }
Searching for a substring
    use jetscii::Substring;

    let colors = "red, blue, green";
    let first = Substring::new(", ").find(colors);
    assert_eq!(first, Some(3));
Searching for a subslice
    use jetscii::ByteSubstring;

    let raw_data = [0x00, 0x01, 0x10, 0xFF, 0x42];
    let first = ByteSubstring::new(&[0x10, 0xFF]).find(&raw_data);
    assert_eq!(first, Some(2));
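To make the returned index concrete, here is a minimal `std`-only sketch of a subslice search with the same result on the data above. The name `find_subslice` is hypothetical and not part of this crate, and returning `Some(0)` for an empty needle is an assumption borrowed from how `str::find("")` behaves.

```rust
/// Returns the index of the first occurrence of `needle` in `haystack`.
/// Hypothetical helper for illustration; not part of jetscii.
fn find_subslice(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    // `windows(0)` panics, so handle the empty needle explicitly.
    // Assumption: an empty needle matches at index 0, like `str::find("")`.
    if needle.is_empty() {
        return Some(0);
    }
    // Slide a window of the needle's length over the haystack and
    // report the first position where the window equals the needle.
    haystack
        .windows(needle.len())
        .position(|window| window == needle)
}
```

This naive scan compares every window, so it is useful as a reference for correctness rather than as a performance baseline.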
Using the pattern API
If this crate is compiled with the unstable `pattern` feature flag, `AsciiChars` will implement the `Pattern` trait (`std::str::pattern::Pattern`), allowing it to be used with many traditional string methods.
    #[macro_use]
    extern crate jetscii;

    fn main() {
        let part_number = "86-J52:rev1";
        let parts: Vec<_> = part_number.split(ascii_chars!('-', ':')).collect();
        assert_eq!(&parts, &["86", "J52", "rev1"]);
    }
    use jetscii::Substring;

    let colors = "red, blue, green";
    let colors: Vec<_> = colors.split(Substring::new(", ")).collect();
    assert_eq!(&colors, &["red", "blue", "green"]);
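For comparison, the standard library's own patterns already work with these methods on stable Rust, which may help when deciding whether the unstable feature is needed. The sketch below uses only `std`; the helper name `split_on_any` is hypothetical.

```rust
/// Splits `input` at any of the given delimiter characters.
/// A slice of `char`s is one of the pattern types `std` accepts.
/// Hypothetical helper for illustration; not part of jetscii.
fn split_on_any<'a>(input: &'a str, delimiters: &[char]) -> Vec<&'a str> {
    input.split(delimiters).collect()
}
```

Calling `split_on_any("86-J52:rev1", &['-', ':'])` yields the same three parts as the `ascii_chars!` version above; a plain `&str` such as `", "` can likewise be passed straight to `str::split`.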
What's so special about this library?
We use a particular set of x86-64 SSE 4.2 instructions (`PCMPESTRI` and `PCMPESTRM`) to gain great speedups. This method stays fast even when searching for a byte in a set of up to 16 choices.
When the `PCMPxSTRx` instructions are not available, we fall back to reasonably fast but universally-supported methods.
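The general shape of this approach can be sketched with the intrinsics in `std::arch`. This is an illustration under stated assumptions, not the crate's actual implementation: the function names (`find_any`, `find_any_sse42`, `find_any_fallback`) are hypothetical, the SSE 4.2 path is selected by runtime feature detection, and each chunk is copied into a padded buffer for simplicity, which a tuned implementation would likely avoid.

```rust
/// Portable fallback: index of the first haystack byte found in `needles`.
fn find_any_fallback(needles: &[u8], haystack: &[u8]) -> Option<usize> {
    haystack.iter().position(|b| needles.contains(b))
}

/// SSE 4.2 path: compares 16 haystack bytes at a time against the set
/// using PCMPESTRI in "equal any" mode.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse4.2")]
unsafe fn find_any_sse42(needles: &[u8], haystack: &[u8]) -> Option<usize> {
    use std::arch::x86_64::{__m128i, _mm_cmpestri, _mm_loadu_si128, _SIDD_CMP_EQUAL_ANY};

    // Copy the needle set (at most 16 bytes) into a register-sized buffer.
    let mut set = [0u8; 16];
    set[..needles.len()].copy_from_slice(needles);
    let set_reg = _mm_loadu_si128(set.as_ptr() as *const __m128i);

    for (chunk_index, chunk) in haystack.chunks(16).enumerate() {
        // Pad the chunk so a full 16-byte load is always in bounds.
        let mut buf = [0u8; 16];
        buf[..chunk.len()].copy_from_slice(chunk);
        let chunk_reg = _mm_loadu_si128(buf.as_ptr() as *const __m128i);

        // The explicit lengths mark which bytes of each register are valid,
        // so the zero padding can never produce a match.
        let idx = _mm_cmpestri::<_SIDD_CMP_EQUAL_ANY>(
            set_reg,
            needles.len() as i32,
            chunk_reg,
            chunk.len() as i32,
        ) as usize;

        // The instruction reports 16 when nothing in the chunk matched.
        if idx < chunk.len() {
            return Some(chunk_index * 16 + idx);
        }
    }
    None
}

/// Dispatches to the SSE 4.2 path when the CPU supports it.
fn find_any(needles: &[u8], haystack: &[u8]) -> Option<usize> {
    assert!(needles.len() <= 16, "at most 16 needle bytes are supported");

    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("sse4.2") {
            // SAFETY: SSE 4.2 support was verified on the line above.
            return unsafe { find_any_sse42(needles, haystack) };
        }
    }
    find_any_fallback(needles, haystack)
}
```

The 16-byte limit on the needle set follows directly from the width of the SSE register that holds it.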
Benchmarks
These numbers come from running on my personal laptop; always benchmark with data and machines similar to your own.
Single character
Searching a 5 MiB string of `a`s with a single space at the end for a space:
Method | Speed |
---|---|
ascii_chars!(' ').find(s) | 11504 MB/s |
s.as_bytes().iter().position(|&c| c == b' ') | 2377 MB/s |
s.find(" ") | 2149 MB/s |
s.find(&[' '][..]) | 1151 MB/s |
s.find(' ') | 14600 MB/s |
s.find(|c| c == ' ') | 1080 MB/s |
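To reproduce a measurement like this on your own machine, a rough timing harness can be sketched with `std` alone. This is not how the numbers above were produced; the helper names are hypothetical, and 1 MB is assumed to mean 10^6 bytes. A dedicated benchmarking tool will give more reliable results than a hand-rolled loop.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Builds a haystack of `len` bytes: `len - 1` copies of `a`, then `tail`.
fn build_haystack(len: usize, tail: char) -> String {
    assert!(len >= 1 && tail.is_ascii());
    let mut s = "a".repeat(len - 1);
    s.push(tail);
    s
}

/// Converts a byte count and elapsed time into MB/s.
/// Assumption: 1 MB = 1,000,000 bytes.
fn throughput_mb_per_s(bytes: usize, elapsed: Duration) -> f64 {
    bytes as f64 / 1_000_000.0 / elapsed.as_secs_f64()
}

/// Runs `search` over `haystack` `iterations` times and returns the
/// average throughput in MB/s.
fn measure<F>(haystack: &str, iterations: u32, mut search: F) -> f64
where
    F: FnMut(&str) -> Option<usize>,
{
    let start = Instant::now();
    for _ in 0..iterations {
        // `black_box` discourages the optimizer from removing the search.
        black_box(search(black_box(haystack)));
    }
    throughput_mb_per_s(haystack.len() * iterations as usize, start.elapsed())
}
```

For example, `measure(&build_haystack(5 * 1024 * 1024, ' '), 100, |s| s.find(' '))` times the `s.find(' ')` row; build with optimizations enabled, since debug builds are not representative.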
Set of 3 characters
Searching a 5 MiB string of `a`s with a single ampersand at the end for `<`, `>`, and `&`:
Method | Speed |
---|---|
ascii_chars!(/* ... */).find(s) | 11513 MB/s |
s.as_bytes().iter().position(|&c| /* ... */) | 1644 MB/s |
s.find(&[/* ... */][..]) | 1079 MB/s |
s.find(|c| /* ... */) | 1084 MB/s |
Set of 5 characters
Searching a 5 MiB string of `a`s with a single ampersand at the end for `<`, `>`, `&`, `'`, and `"`:
Method | Speed |
---|---|
ascii_chars!(/* ... */).find(s) | 11504 MB/s |
s.as_bytes().iter().position(|&c| /* ... */) | 812 MB/s |
s.find(&[/* ... */][..]) | 538 MB/s |
s.find(|c| /* ... */) | 1082 MB/s |
Substring
Searching a 5 MiB string of `a`s with the string "xyzzy" at the end for "xyzzy":
Method | Speed |
---|---|
Substring::new("xyzzy").find(s) | 11475 MB/s |
s.find("xyzzy") | 5391 MB/s |
Macros
ascii_chars | A convenience constructor for an `AsciiChars`. |
bytes | A convenience constructor for a `Bytes`. |
Structs
AsciiChars | Searches a string for a set of ASCII characters. Up to 16 characters may be used. |
ByteSubstring | Searches a slice for the first occurrence of the subslice. |
Bytes | Searches a slice for a set of bytes. Up to 16 bytes may be used. |
Substring | Searches a string for the first occurrence of the substring. |
Type Definitions
AsciiCharsConst | A convenience type that can be used in a constant or static. |
ByteSubstringConst | A convenience type that can be used in a constant or static. |
BytesConst | A convenience type that can be used in a constant or static. |
SubstringConst | A convenience type that can be used in a constant or static. |