Reference#
Pattern #
Pattern to search matches for.
Source code in python/pyperscan/_pyperscan.pyi
__new__ #
__new__(expression, *flags, tag=None)
Construct a new search pattern.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
expression | bytes | Regular expression. | required |
flags | Flag | modify expression matching behavior. | () |
tag | Any | Python object to pass to callback when match succeeds. If unset, the pattern index is used. | None |
Flag #
Pattern compile flags.
CASELESS
class-attribute
instance-attribute
#
CASELESS: Flag = Ellipsis
Set case-insensitive matching.
This flag sets the expression to be matched case-insensitively by default. The expression may still use PCRE tokens (notably (?i) and (?-i)) to switch case-insensitive matching on and off.
DOTALL
class-attribute
instance-attribute
#
DOTALL: Flag = Ellipsis
Matching a . will not exclude newlines.
This flag sets any instances of the . token to match newline characters as well as all other characters. The PCRE specification states that the . token does not match newline characters by default, so without this flag the . token will not cross line boundaries.
See HS_FLAG_DOTALL
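The PCRE rule described above can be observed with Python's standard `re` module, which treats `.` the same way. This is only an illustration of the semantics, with `re.DOTALL` standing in for the flag; it does not call Hyperscan or Pyperscan.

```python
import re

data = b"first\nsecond"

# Without DOTALL, "." stops at the newline, so the match cannot cross lines.
without_dotall = re.search(rb"first.second", data)

# With DOTALL, "." also matches "\n", so the pattern spans both lines.
with_dotall = re.search(rb"first.second", data, re.DOTALL)

print(without_dotall is None)
print(with_dotall.span())
```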
MULTILINE
class-attribute
instance-attribute
#
MULTILINE: Flag = Ellipsis
Set multi-line anchoring.
This flag instructs the expression to make the ^ and $ tokens match newline characters as well as the start and end of the stream. If this flag is not specified, the ^ token will only ever match at the start of a stream, and the $ token will only ever match at the end of a stream within the guidelines of the PCRE specification.
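A small sketch of the anchoring difference, using the standard `re` module as a stand-in for the PCRE behaviour described above (it does not invoke Hyperscan):

```python
import re

data = b"foo\nfoo\nbar"

# Default anchoring: "^" matches only at the very start of the data.
default_starts = [m.start() for m in re.finditer(rb"^foo", data)]

# Multi-line anchoring: "^" also matches right after each newline.
multiline_starts = [m.start() for m in re.finditer(rb"^foo", data, re.MULTILINE)]

print(default_starts, multiline_starts)
```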
SINGLEMATCH
class-attribute
instance-attribute
#
SINGLEMATCH: Flag = Ellipsis
Set single-match only mode.
This flag sets the expression's match ID to match at most once. In streaming mode, this means that the expression will return only a single match over the lifetime of the stream, rather than reporting every match as per standard Hyperscan semantics. In block mode or vectored mode, only the first match for each invocation of scan() will be returned.
If multiple expressions in the database share the same match ID, then either all of them must specify SINGLEMATCH or none of them may. If a group of expressions sharing a match ID specify the flag, then at most one match with the match ID will be generated per stream.
Note: The use of this flag in combination with SOM_LEFTMOST is not currently supported.
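The observable effect can be modelled in plain Python as a filter over match events. This is a sketch of the documented behaviour only, not of how Hyperscan implements it; `single_match_only` and the event tuples are hypothetical.

```python
from typing import Hashable, Iterable, Iterator, Tuple

Match = Tuple[Hashable, int, int]  # (match ID, start, end)


def single_match_only(events: Iterable[Match]) -> Iterator[Match]:
    """Yield only the first event seen for each match ID."""
    seen = set()
    for event in events:
        match_id = event[0]
        if match_id not in seen:
            seen.add(match_id)
            yield event


# Events as they might be reported without the flag (made-up offsets).
events = [(0, 0, 3), (1, 2, 5), (0, 6, 9), (1, 8, 11)]
first_only = list(single_match_only(events))
print(first_only)
```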
ALLOWEMPTY
class-attribute
instance-attribute
#
ALLOWEMPTY: Flag = Ellipsis
Allow expressions that can match against empty buffers.
This flag instructs the compiler to allow expressions that can match against empty buffers, such as .?, .* and (a|). Since Hyperscan can return every possible match for an expression, such expressions generally execute very slowly; the default behaviour is to return an error when an attempt to compile one is made. Using this flag will force the compiler to allow such an expression.
UTF8
class-attribute
instance-attribute
#
UTF8: Flag = Ellipsis
Enable UTF-8 mode for this expression.
This flag instructs Hyperscan to treat the pattern as a sequence of UTF-8 characters. The results of scanning invalid UTF-8 sequences with a Hyperscan library that has been compiled with one or more patterns using this flag are undefined.
See HS_FLAG_UTF8
UCP
class-attribute
instance-attribute
#
UCP: Flag = Ellipsis
Enable Unicode property support for this expression.
This flag instructs Hyperscan to use Unicode properties, rather than the default ASCII interpretations, for character mnemonics like \w and \s as well as the POSIX character classes. It is only meaningful in conjunction with UTF8.
See HS_FLAG_UCP
PREFILTER
class-attribute
instance-attribute
#
PREFILTER: Flag = Ellipsis
Enable prefiltering mode for this expression.
This flag instructs Hyperscan to compile an “approximate” version of this pattern for use in a prefiltering application, even if Hyperscan does not support the pattern in normal operation.
The set of matches returned when this flag is used is guaranteed to be a superset of the matches specified by the non-prefiltering expression.
If the pattern contains pattern constructs not supported by Hyperscan (such as zero-width assertions, back-references or conditional references) these constructs will be replaced internally with broader constructs that may match more often.
Furthermore, in prefiltering mode Hyperscan may simplify a pattern that would otherwise return a “Pattern too large” error at compile time, or for performance reasons (subject to the matching guarantee above).
It is generally expected that the application will subsequently confirm prefilter matches with another regular expression matcher that can provide exact matches for the pattern.
Note: The use of this flag in combination with SOM_LEFTMOST is not currently supported.
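The confirmation step mentioned above might look roughly like the following, with the standard `re` module as the exact matcher. The candidate end offsets are hand-written placeholders for what a prefilter scan could report (start offsets are unavailable because SOM_LEFTMOST cannot be combined with this flag), and `confirm` is a hypothetical helper. Using non-overlapping `finditer` results is a simplification.

```python
import re

# Exact pattern with a back-reference, a construct that needs prefiltering.
EXACT = re.compile(rb"(\w+)=\1")


def confirm(data: bytes, candidate_ends):
    """Return the exact matches whose end offset was flagged by the prefilter."""
    ends = set(candidate_ends)
    return [m.span() for m in EXACT.finditer(data) if m.end() in ends]


data = b"a=a b=c"
# Hypothetical prefilter output: a superset that includes a false positive at 7.
candidate_ends = [3, 7]
confirmed = confirm(data, candidate_ends)
print(confirmed)
```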
SOM_LEFTMOST
class-attribute
instance-attribute
#
SOM_LEFTMOST: Flag = Ellipsis
Enable leftmost start of match reporting.
This flag instructs Hyperscan to report the leftmost possible start of match offset when a match is reported for this expression. (By default, no start of match is returned.)
For all three modes, enabling this behaviour may reduce performance. In particular, it may increase stream state requirements in streaming mode.
COMBINATION
class-attribute
instance-attribute
#
COMBINATION: Flag = Ellipsis
Logical combination.
This flag instructs Hyperscan to parse this expression as logical combination syntax. Logical constraints consist of operands, operators and parentheses. The operands are expression indices, and operators can be ! (NOT), & (AND) or | (OR). For example:
(101&102&103)|(104&!105)
((301|302)&303)&(304|305)
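To make the syntax concrete, here is a minimal evaluator that decides whether a combination holds for a given set of matched expression indices. It is a sketch, not Hyperscan's implementation: it assumes the precedence ! > & > |, evaluates once over a final set rather than incrementally during a scan, and performs no error handling. The name `evaluate` is hypothetical.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[!&|()])")


def evaluate(expression: str, matched: set) -> bool:
    """Evaluate a logical combination against the set of matched indices."""
    tokens = TOKEN.findall(expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_or():
        value = parse_and()
        while peek() == "|":
            take()
            rhs = parse_and()  # always parse the operand, even if value is True
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "&":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        if peek() == "!":
            take()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        token = take()
        if token == "(":
            value = parse_or()
            take()  # consume the closing ")"
            return value
        return int(token) in matched

    return parse_or()


print(evaluate("(101&102&103)|(104&!105)", {104}))
```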
QUIET
class-attribute
instance-attribute
#
QUIET: Flag = Ellipsis
Don't do any match reporting.
This flag instructs Hyperscan to ignore match reporting for this expression. It is designed to be used on the sub-expressions in logical combinations.
See HS_FLAG_QUIET
OnMatch #
Bases: Protocol, Generic[_TContext_contra]
Callback called on match.
__call__ #
__call__(context, tag, start, end)
Called when a match happens.
Note
Call parameters are passed positionally.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
context | _TContext_contra | Object passed to Database.build | required |
tag | Any | Pattern.tag of the pattern matched. | required |
start | int | start index of the matched pattern. | required |
end | Any | end index of the matched pattern. | required |
Returns:
Type | Description |
---|---|
Scan | Instructs Hyperscan whether to continue or stop searching for subsequent matches. |
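A callback with this shape could be written as below. To keep the sketch runnable without the package, `Scan` here is a local stand-in enum; the member names `Continue` and `Terminate` are assumptions about pyperscan.Scan and should be checked against the installed version. The function name and the three-match limit are illustrative only.

```python
import enum
from typing import Any, List, Tuple


class Scan(enum.Enum):
    """Local stand-in for pyperscan.Scan (member names are assumed)."""

    Continue = enum.auto()
    Terminate = enum.auto()


def collect_first_three(
    context: List[Tuple[Any, int, int]], tag: Any, start: int, end: int
) -> Scan:
    """Record each match in `context` and ask to stop after three matches."""
    context.append((tag, start, end))
    return Scan.Terminate if len(context) >= 3 else Scan.Continue


# Simulate the positional calls a scanner would make.
found: List[Tuple[Any, int, int]] = []
results = [collect_first_three(found, "tag", i, i + 1) for i in range(3)]
print(results)
```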
Database #
Bases: Generic[_TScanner]
A Hyperscan pattern database.
__new__ #
__new__(*patterns)
Compiles a Hyperscan pattern database.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
patterns | Pattern | Expressions to compile into the database to later match against. | () |
Note
Calls hs_compile_ext_multi internally.
build #
build(context, on_match)
Build a scanner object that can be used to search for pattern occurrences.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
context | _TContext_contra | arbitrary object which is passed as the first parameter to on_match. | required |
on_match | OnMatch[_TContext_contra] | callable to call when a match happens. | required |
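Putting Pattern, a database, build and a callback together, a block scan might be sketched as follows. The calls follow the signatures documented on this page; `ps.Scan.Continue` is an assumed member name, `find_all` is a hypothetical helper, and the import is guarded only so that the snippet can be loaded where pyperscan is not installed.

```python
from typing import Any, List, Tuple

try:
    import pyperscan as ps
except ImportError:  # pyperscan is not installed; the demo below is skipped
    ps = None

Match = Tuple[Any, int, int]


def find_all(data: bytes) -> List[Match]:
    """Scan one block and return (tag, start, end) for every match."""

    def on_match(context: List[Match], tag: Any, start: int, end: int):
        context.append((tag, start, end))
        return ps.Scan.Continue  # assumed member name

    pattern = ps.Pattern(
        rb"foo", ps.Flag.CASELESS, ps.Flag.SOM_LEFTMOST, tag="foo-tag"
    )
    database = ps.BlockDatabase(pattern)
    matches: List[Match] = []
    # The list is the context: it is handed back as the callback's first argument.
    scanner = database.build(matches, on_match)
    scanner.scan(data)
    return matches


if ps is not None:
    print(find_all(b"xxFoOxx"))
```

Using a list as the context keeps the callback free of global state; any object can be passed instead.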
BlockDatabase #
Bases: Database[BlockScanner]
A database for block (non-streaming) scanning.
VectoredDatabase #
Bases: Database[VectoredScanner]
A database for vectored scanning.
StreamDatabase #
Bases: Database[StreamScanner]
A database for stream scanning.
Scan #
Match callback return value to instruct Hyperscan whether to continue or terminate scanning.
BlockScanner #
Created from BlockDatabase for block scanning.
scan #
scan(data)
Scan for matches in a single buffer (block).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | BufferType | buffer to search matches in. Can be any object implementing the buffer protocol. | required |
Returns:
Type | Description |
---|---|
Scan | Indicates whether scanning was terminated by the match callback. |
VectoredScanner #
Created from VectoredDatabase for scanning.
scan #
scan(data)
Scan for matches in multiple buffers (vector).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | Collection[BufferType] | buffers to search matches in. Each can be any object implementing the buffer protocol. | required |
Returns:
Type | Description |
---|---|
Scan | Indicates whether scanning was terminated by the match callback. |
StreamScanner #
Created from StreamDatabase for stream scanning.
scan #
scan(data, chunk_size=None)
Scan for matches in a stream.
Multiple calls contribute to the same scanning operation: matches can span the boundary between consecutive scan calls, and match start/end offsets are counted from the first scan call.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | BufferType | buffer to search matches in. Can be any object implementing the buffer protocol. | required |
chunk_size | int \| None | when provided, | None |
Tip
Hyperscan can match only up to the first 4 GiB of a buffer. Use an arbitrarily big buffer with a chunk_size of less than 4 GiB to overcome this limitation.
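One plausible reading of this tip is that a large buffer is fed to Hyperscan in slices no larger than chunk_size. How Pyperscan does this internally is not stated here; the hypothetical `iter_chunks` helper below only illustrates the slicing idea with a zero-copy memoryview.

```python
from typing import Iterator, Tuple


def iter_chunks(data, chunk_size: int) -> Iterator[Tuple[int, memoryview]]:
    """Yield (offset, chunk) pairs covering `data` without copying it."""
    view = memoryview(data).cast("B")  # flat byte view of any contiguous buffer
    for offset in range(0, len(view), chunk_size):
        yield offset, view[offset:offset + chunk_size]


chunks = [(offset, bytes(chunk)) for offset, chunk in iter_chunks(b"abcdefg", 3)]
print(chunks)
```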
Returns:
Type | Description |
---|---|
Scan | Indicates whether scanning was terminated by the match callback. |
reset #
reset()
Reset stream scanning to its initial state.
After this method is called, all in-flight possible matches are discarded and the subsequent scan operation will act as the first call, counting match indices from zero.
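The offset accounting across scan calls and the effect of reset can be modelled with a toy literal matcher. `ToyLiteralStream` is a hypothetical pure-Python stand-in written for this illustration; it is not StreamScanner and supports only a single fixed byte string.

```python
from typing import List


class ToyLiteralStream:
    """Toy stream matcher: finds one literal across successive scan() calls."""

    def __init__(self, needle: bytes) -> None:
        self.needle = needle
        self.reset()

    def reset(self) -> None:
        self._tail = b""    # last len(needle) - 1 bytes seen so far
        self._consumed = 0  # bytes consumed since the stream started

    def scan(self, data: bytes) -> List[int]:
        """Return end offsets of matches, counted from the start of the stream."""
        window = self._tail + data
        base = self._consumed - len(self._tail)  # stream offset of window[0]
        ends = []
        pos = window.find(self.needle)
        while pos != -1:
            ends.append(base + pos + len(self.needle))
            pos = window.find(self.needle, pos + 1)
        self._consumed += len(data)
        keep = len(self.needle) - 1
        self._tail = window[-keep:] if keep else b""
        return ends


stream = ToyLiteralStream(b"foobar")
first = stream.scan(b"xxfoo")        # the literal is still incomplete
second = stream.scan(b"barfoobar")   # completes one match, contains another
stream.reset()
after_reset = stream.scan(b"foobar")  # offsets start again from zero
print(first, second, after_reset)
```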
HyperscanErrorCode #
List of errors that can be returned by the low-level Hyperscan operations.
Most error codes cannot appear in Pyperscan.
Invalid
class-attribute
instance-attribute
#
Invalid: HyperscanErrorCode = Ellipsis
A parameter passed to this function was invalid.
This error is only returned in cases where the function can detect an invalid parameter; it cannot be relied upon to detect (for example) pointers to freed memory or other invalid data.
Nomem
class-attribute
instance-attribute
#
Nomem: HyperscanErrorCode = Ellipsis
A memory allocation failed.
ScanTerminated
class-attribute
instance-attribute
#
ScanTerminated: HyperscanErrorCode = Ellipsis
The engine was terminated by callback.
This return value indicates that the target buffer was partially scanned, but that the callback function requested that scanning cease after a match was located.
CompilerError
class-attribute
instance-attribute
#
CompilerError: HyperscanErrorCode = Ellipsis
The pattern compiler failed, and the hs_compile_error_t should be inspected for more detail.
DbVersionError
class-attribute
instance-attribute
#
DbVersionError: HyperscanErrorCode = Ellipsis
The given database was built for a different version of Hyperscan.
DbPlatformError
class-attribute
instance-attribute
#
DbPlatformError: HyperscanErrorCode = Ellipsis
The given database was built for a different platform (i.e., CPU type).
DbModeError
class-attribute
instance-attribute
#
DbModeError: HyperscanErrorCode = Ellipsis
The given database was built for a different mode of operation.
This error is returned when streaming calls are used with a block or vectored database and vice versa.
BadAlign
class-attribute
instance-attribute
#
BadAlign: HyperscanErrorCode = Ellipsis
A parameter passed to this function was not correctly aligned.
BadAlloc
class-attribute
instance-attribute
#
BadAlloc: HyperscanErrorCode = Ellipsis
The memory allocator (either malloc() or the allocator set with hs_set_allocator()) did not correctly return memory suitably aligned for the largest representable data type on this platform.
ScratchInUse
class-attribute
instance-attribute
#
ScratchInUse: HyperscanErrorCode = Ellipsis
The scratch region was already in use.
This error is returned when Hyperscan is able to detect that the scratch region given is already in use by another Hyperscan API call.
A separate scratch region, allocated with hs_alloc_scratch() or hs_clone_scratch(), is required for every concurrent caller of the Hyperscan API.
For example, this error might be returned when hs_scan() has been called inside a callback delivered by a currently-executing hs_scan() call using the same scratch region.
Note: Not all concurrent uses of scratch regions may be detected. This error is intended as a best-effort debugging tool, not a guarantee.
ArchError
class-attribute
instance-attribute
#
ArchError: HyperscanErrorCode = Ellipsis
Unsupported CPU architecture.
This error is returned when Hyperscan is able to detect that the current system does not support the required instruction set.
At a minimum, Hyperscan requires Supplemental Streaming SIMD Extensions 3 (SSSE3).
InsufficientSpace
class-attribute
instance-attribute
#
InsufficientSpace: HyperscanErrorCode = Ellipsis
Provided buffer was too small.
This error indicates that there was insufficient space in the buffer. The call should be repeated with a larger provided buffer.
Note: in this situation, it is normal for the amount of space required to be returned in the same manner as the used space would have been returned if the call was successful.
UnknownError
class-attribute
instance-attribute
#
UnknownError: HyperscanErrorCode = Ellipsis
Unexpected internal error.
This error indicates that there was unexpected matching behavior. This could be related to invalid usage of stream and scratch space or invalid memory operations by users.
UnknownErrorCode
class-attribute
instance-attribute
#
UnknownErrorCode: HyperscanErrorCode = Ellipsis
An error unknown to Pyperscan happened.
HyperscanError #
An error happened while calling the low level Hyperscan API.
args
instance-attribute
#
args: tuple[HyperscanErrorCode, int]
An error flag and low level error codes.
HyperscanCompileError #
One of the patterns to be compiled is invalid.
args
instance-attribute
#
args: tuple[str, int]
Contains a human readable message and the index of the offending pattern.
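Given that args layout, a caller could map the index back to the offending expression. The helper below is a hypothetical sketch; it is exercised with a plain Exception standing in for HyperscanCompileError, and the error message is made up.

```python
from typing import Sequence


def describe_compile_error(error: Exception, expressions: Sequence[bytes]) -> str:
    """Format a compile error using its (message, pattern index) args."""
    message, index = error.args
    return f"pattern #{index} ({expressions[index]!r}): {message}"


expressions = [rb"foo", rb"(unclosed"]
# Stand-in for a HyperscanCompileError raised while compiling `expressions`.
stand_in = Exception("Missing close parenthesis", 1)
report = describe_compile_error(stand_in, expressions)
print(report)
```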