Regex conditionals
Ingest pipelines support conditional logic based on regular expressions (regex) written in the Painless scripting language. This gives you fine-grained control over which documents a processor handles, based on the structure and contents of text fields. Use regex within a processor's if parameter to evaluate string patterns. This is especially useful for matching IP formats, validating email addresses, identifying UUIDs, or processing logs that contain specific keywords.
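Painless provides two regex operators: =~ (find), which succeeds if the pattern occurs anywhere in the string, and ==~ (match), which succeeds only if the entire string matches. The following Python sketch approximates that difference locally. The helper names painless_find and painless_match are hypothetical and exist only for illustration, and Python's re module is assumed to behave like Painless regex for simple patterns such as these:

```python
import re

# Hypothetical helpers that imitate the two Painless regex operators in Python.
# They are not part of any ingest pipeline API.

def painless_find(pattern: str, text: str) -> bool:
    """Approximates `text =~ /pattern/`: true if the pattern occurs anywhere."""
    return re.search(pattern, text) is not None


def painless_match(pattern: str, text: str) -> bool:
    """Approximates `text ==~ /pattern/`: true only if the whole string matches."""
    return re.fullmatch(pattern, text) is not None


line = "2024-05-01 ERROR disk full"  # made-up log line

print(painless_find(r"ERROR", line))       # True: the keyword appears somewhere
print(painless_match(r"ERROR", line))      # False: the whole line is not just "ERROR"
print(painless_match(r".*ERROR.*", line))  # True: the pattern now spans the whole line
```

The examples on this page use the find operator together with ^ and $ anchors wherever a whole-string match is intended.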
Example: Email domain filtering
The following pipeline uses regex to identify users from the @example.com email domain and tag those documents accordingly:
PUT _ingest/pipeline/tag_example_com_users
{
  "processors": [
    {
      "set": {
        "field": "user_domain",
        "value": "example.com",
        "if": "ctx.email != null && ctx.email =~ /@example\\.com$/"
      }
    }
  ]
}
Use the following request to simulate the pipeline:
POST _ingest/pipeline/tag_example_com_users/_simulate
{
  "docs": [
    { "_source": { "email": "[email protected]" } },
    { "_source": { "email": "[email protected]" } }
  ]
}
Only the first document has the user_domain field added:
{
  "docs": [
    {
      "doc": {
        "_source": {
          "email": "[email protected]",
          "user_domain": "example.com"
        }
      }
    },
    {
      "doc": {
        "_source": {
          "email": "[email protected]"
        }
      }
    }
  ]
}
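A detail worth checking in a domain condition like this one is whether the dot is escaped. In regex, an unescaped . matches any single character, so a pattern for example.com can also accept look-alike strings. The following Python sketch compares the two forms. The sample addresses and the tags_domain helper are hypothetical, and Python's re module is assumed to treat these patterns the same way Painless does:

```python
import re

UNESCAPED = re.compile(r"@example.com$")  # '.' matches any single character
ESCAPED = re.compile(r"@example\.com$")   # '\.' matches a literal dot only


def tags_domain(pattern, email):
    """Mirrors the pipeline condition: a null check followed by a regex find."""
    return email is not None and pattern.search(email) is not None


# Hypothetical addresses, used only for illustration.
samples = ["user@example.com", "user@exampleXcom", "user@another.com", None]

for email in samples:
    print(email, tags_domain(UNESCAPED, email), tags_domain(ESCAPED, email))
```

Both patterns accept user@example.com and reject user@another.com, but only the unescaped pattern accepts user@exampleXcom. The None sample shows why the condition checks for a missing field before applying the regex.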
Example: Detect IPv6 addresses
The following pipeline uses regex to flag values that look like IPv6 addresses, that is, strings made up only of hexadecimal digits and colons that contain at least one colon:
PUT _ingest/pipeline/ipv6_flagger
{
  "processors": [
    {
      "set": {
        "field": "ip_type",
        "value": "IPv6",
        "if": "ctx.ip != null && ctx.ip =~ /^[a-fA-F0-9:]+$/ && ctx.ip.contains(':')"
      }
    }
  ]
}
Use the following request to simulate the pipeline:
POST _ingest/pipeline/ipv6_flagger/_simulate
{
  "docs": [
    { "_source": { "ip": "2001:0db8:85a3:0000:0000:8a2e:0370:7334" } },
    { "_source": { "ip": "192.168.0.1" } }
  ]
}
The first document contains an added ip_type field set to IPv6:
{
  "docs": [
    {
      "doc": {
        "_source": {
          "ip": "2001:0db8:85a3:0000:0000:8a2e:0370:7334",
          "ip_type": "IPv6"
        }
      }
    },
    {
      "doc": {
        "_source": {
          "ip": "192.168.0.1"
        }
      }
    }
  ]
}
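The condition in this pipeline checks only the characters of the value, not the address structure, so it works as a format heuristic rather than as a validator. The following Python sketch reproduces the condition and compares it with the standard library's ipaddress parser. The function names and the extra sample values are hypothetical:

```python
import ipaddress
import re

HEX_AND_COLONS = re.compile(r"^[a-fA-F0-9:]+$")


def flagged_as_ipv6(ip):
    """Mirrors the pipeline condition: null check, character class, colon."""
    return ip is not None and HEX_AND_COLONS.search(ip) is not None and ":" in ip


def is_strict_ipv6(ip):
    """Stricter comparison check that parses the address structure."""
    try:
        ipaddress.IPv6Address(ip)
        return True
    except ValueError:  # AddressValueError is a subclass of ValueError
        return False


# The first two values come from the simulate request; the rest are made up.
samples = [
    "2001:0db8:85a3:0000:0000:8a2e:0370:7334",
    "192.168.0.1",
    ":::",                 # only colons: passes the heuristic, not a valid address
    "12345::1",            # group longer than four hex digits
    "::ffff:192.168.0.1",  # valid IPv4-mapped form, rejected by the heuristic
]

for ip in samples:
    print(ip, flagged_as_ipv6(ip), is_strict_ipv6(ip))
```

If a false positive such as ::: matters for your data, consider tightening the pattern or validating the field in a later step.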
Example: Validate UUID strings
The following pipeline uses regex to verify whether a session_id field contains a valid version 4 UUID written in lowercase hexadecimal:
PUT _ingest/pipeline/uuid_checker
{
  "processors": [
    {
      "set": {
        "field": "valid_uuid",
        "value": true,
        "if": "ctx.session_id != null && ctx.session_id =~ /^[a-f0-9]{8}-[a-f0-9]{4}-4[a-f0-9]{3}-[89ab][a-f0-9]{3}-[a-f0-9]{12}$/"
      }
    }
  ]
}
Use the following request to simulate the pipeline:
POST _ingest/pipeline/uuid_checker/_simulate
{
  "docs": [
    { "_source": { "session_id": "550e8400-e29b-41d4-a716-446655440000" } },
    { "_source": { "session_id": "invalid-uuid-1234" } }
  ]
}
The first document is tagged with a new valid_uuid field:
{
  "docs": [
    {
      "doc": {
        "_source": {
          "session_id": "550e8400-e29b-41d4-a716-446655440000",
          "valid_uuid": true
        }
      }
    },
    {
      "doc": {
        "_source": {
          "session_id": "invalid-uuid-1234"
        }
      }
    }
  ]
}
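The pattern in this pipeline is narrower than UUIDs in general: the literal 4 restricts the version digit, [89ab] restricts the variant digit, and the character classes accept lowercase hexadecimal only. The following Python sketch applies the same pattern to a few values to show which ones pass. The passes_pipeline_check helper is hypothetical, and the uppercase and version 1 samples are added here for illustration:

```python
import re
import uuid

# Same pattern as the pipeline condition: lowercase, version 4, RFC 4122 variant.
UUID_V4_LOWER = re.compile(
    r"^[a-f0-9]{8}-[a-f0-9]{4}-4[a-f0-9]{3}-[89ab][a-f0-9]{3}-[a-f0-9]{12}$"
)


def passes_pipeline_check(session_id):
    """Mirrors the pipeline condition: a null check followed by a regex find."""
    return session_id is not None and UUID_V4_LOWER.search(session_id) is not None


v4_lower = "550e8400-e29b-41d4-a716-446655440000"  # from the simulate request
v4_upper = v4_lower.upper()                        # same UUID, uppercase hex digits
v1_sample = str(uuid.NAMESPACE_DNS)                # a well-known version 1 UUID

for value in [v4_lower, v4_upper, v1_sample, "invalid-uuid-1234"]:
    print(value, passes_pipeline_check(value))

# The standard library parser accepts all three well-formed UUIDs,
# regardless of letter case or version.
print(uuid.UUID(v4_lower).version, uuid.UUID(v1_sample).version)
```

Only the lowercase version 4 value passes. If uppercase input or other UUID versions should also be accepted, the character classes and the version digit in the pattern would need to be widened.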