Kubernetes textPayload Split

Hi

Within the Kubernetes Node parser, I am trying to split the textPayload field into separate fields. The field contains a long string of key="value" pairs, and I would like to extract each pair so that the key becomes a field name and the value becomes its content. An example of a raw log (data nullified):

"textPayload": "time\u003d\"0000-00-00T00:00:00.0000000Z\" type\u003d\"container_app_firewall_audit\" container_name\u003d\"container-name-here\" image_name\u003d\"image-name/here:latest\"

How can I assign, for example, time and type each to its own UDM field, or automatically build an array of key/value pairs? I've configured the parser extension shown below, but it is not splitting the values on spaces.

 

filter {
  mutate {
    replace => {
      "textPayload" => ""
    }
  }

  if [textPayload] != "" {
    mutate {
      split => {
        source => "textPayload"
        separator => " "
        target => "textPayload_array"
      }
    }
    mutate {
      merge => {
        "event.idm.read_only_udm.target.description" => "textPayload_array"
      }
    }
    mutate {
      merge => {
        "@output" => "event"
      }
    }
  }
  statedump {
    label => "foo"
  }
}

 

 


Hi @ad9001 

There is no target.description field in the UDM schema; you may use metadata.description instead.

The split function should give you its output in textPayload_array. Since no keys are assigned, try accessing the values by index, using fields like textPayload_array.0, textPayload_array.1, and so on.

Hi @s_shubh 

Thank you for the suggestion. I tried adding the path but had no luck. I also tried adding a json filter with the split_columns array function, but this time I get an error message:

generic::unknown: pipeline.ParseLogEntry failed: LOG_PARSING_CBN_ERROR: "generic::invalid_argument: failed to convert raw output to events: failed to convert raw message 0: field \"idm\": index 0: recursive rawDataToProto failed: field \"read_only_udm\": index 0: recursive rawDataToProto failed: field \"metadata\": index 0: recursive rawDataToProto failed: panic encountered: non-string given for backstory.Metadata.description: []interface {} []interface {}{\"type=\\\"container_app_firewall_audit\"}"

 

CBN Snippet:

filter{
    mutate {
        replace => {
            "textPayload" => ""
        }
    }
    json {
        source => "message"
        array_function => "split_columns"
        on_error => "_not_json"
    }

if [textPayload] != "" 
{

  mutate {
    split => {
      source => "textPayload"
      separator => "\" "
      target => "textPayload_array"
    }
  }a
  mutate {
    merge => {
      "event.idm.read_only_udm.metadata.description" => "textPayload_array.1"
    }
  }
  mutate {
  merge => {
    "@output" => "event"
 }
}
}
statedump {
  label => "foo"
}
}

 Based on the statedump, I do see the following:

 "textPayload_array": {
    "0": "time=\"0000-00-00T00:00:00.0000Z",
    "1": "type=\"container_app_firewall_audit",
    "10": "source_ip=\"0.0.0.0",
    "11": "request_method=\"GET",
    "12": "request_user_agents=\"Go-http-client/1.1",
    "13": "request_host=\"0.0.0.0:0000",

In the first mutate block under the if condition, I noticed a stray letter "a" after the closing brace. If it is not just a typo in your post, could you remove it and run the parser again?

Also, try using replace instead of merge for description in the mutate block.
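
For reference, a minimal sketch of that change, assuming the split has already produced textPayload_array. The merge fails because it builds a list, while the error above shows that metadata.description expects a single string. The %{...} reference is what substitutes the element's value; a bare name on the right-hand side of replace appears to be taken as a literal string.

```
# Sketch only: copy the second element of the split result into a string field.
mutate {
  replace => {
    "event.idm.read_only_udm.metadata.description" => "%{textPayload_array.1}"
  }
}
```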

 

 

Hi @s_shubh 

Using replace instead of merge no longer produces the error message; however, the output only shows the field name as a literal string rather than its value.

UDM Output:

 

metadata.description: "textPayload_array.1"

 



Is there a better way to split the payload and add all of the array elements to their own fields, or to a single field holding an array? I've included more information below that may help: the UDM output error, the statedump, and the raw log (sanitized).

UDM Output Error:

 

generic::unknown: pipeline.ParseLogEntry failed: LOG_PARSING_CBN_ERROR: "generic::invalid_argument: failed to convert raw output to events: failed to convert raw message 0: field \"idm\": index 0: recursive rawDataToProto failed: field \"read_only_udm\": index 0: recursive rawDataToProto failed: field \"metadata\": index 0: recursive rawDataToProto failed: panic encountered: non-string given for backstory.Metadata.description: []interface {} []interface {}{\"type=\\\"container_app_firewall_audit\"}"

 

 State-dump (sanitized):

 

Internal State (label=foo):

{
  "@createTimestamp": {
    "nanos": 0,
    "seconds": 1715722907
  },
  "@enableCbnForLoop": true,
  "@onErrorCount": 0,
  "@output": [
    {
      "idm": {
        "read_only_udm": {
          "metadata": {
            "description": [
              "type=\"container_app_firewall_audit"
            ]
          }
        }
      }
    }
  ],
  "@timezone": "",
  "_not_json": false,
  "event": {
    "idm": {
      "read_only_udm": {
        "metadata": {
          "description": [
            "type=\"container_app_firewall_audit"
          ]
        }
      }
    }
  },
  "insertId": "id-here",
  "labels": {
    "compute": {
      "googleapis": {
        "com/resource_name": "resource-name-here"
      }
    },
    "k8s-pod/app": "app-here",
    "k8s-pod/controller-revision-hash": "hash-here",
    "k8s-pod/pod-template-generation": "1"
  },
  "logName": "log-name",
  "message": "{\n  \"textPayload\": \"time\\u003d\\\"2024-05-04Z\\\" type\\u003d\\\"container_app_firewall_audit\\\" container_id\\u003d\\\"container-id-here\\\" container_name\\u003d\\\"container-name-here\\\" image_name\\u003d\\\"image-name-here\\\" hostname\\u003d\\\"hostname-here\\\" effect\\u003d\\\"prevent\\\" msg\\u003d\\\"Detected Code Injection attack in request body parameter \\\\\\\"bsh.script\\\\\\\", match exec(\\\\\\\"ipconfig\\\\\\\"), value exec(\\\\\\\"ipconfig\\\\\\\"), injection language: php\\\" log_type\\u003d\\\"codeInjection\\\" source_ip\\u003d\\\"0.0.0.0\\\" source_country\\u003d\\\"country\\\" connecting_ips\\u003d\\\"0.0.0.0\\\" request_method\\u003d\\\"POST\\\" request_user_agents\\u003d\\\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.2.7 (KHTML, like Gecko) Version/9.1.2 Safari/601.2.7\\\" request_host\\u003d\\\"request-host-here.com\\\" request_url\\u003d\\\"request.url.here.com\\\" request_path\\u003d\\\"/path/is/here\\\" request_header_names\\u003d\\\"Accept-Encoding,Content-Length,Content-Type,User-Agent,Via,X-Cloud-Trace-Context,X-app-info-here-Parent-Id,X-app-info-here-Sampling-Priority,X-app-info-here-Trace-Id,X-Envoy-Attempt-Count,X-Envoy-External-Address,X-Envoy-Original-Path,X-Forwarded-For,X-Forwarded-Port,X-Forwarded-Proto,X-Request-Id\\\" cluster\\u003d\\\"cluster-name-here\\\" attack_techniques\\u003d\\\"exploit-here\\\" rule_name\\u003d\\\"rule-name\\\" rule_app_id\\u003d\\\"name-of-rule-app\\\" protection\\u003d\\\"firewall\\\" attack_field_type\\u003d\\\"formBody\\\" attack_field_key\\u003d\\\"bsh.script\\\" attack_field_value\\u003d\\\"exec(\\\"ipconfig\\\")\\\" event_id\\u003d\\\"event-id\\\"\",\n  \"insertId\": \"id-is-here\",\n  \"resource\": {\n    \"type\": \"k8s_container\",\n    \"labels\": {\n      \"container_name\": \"container name\",\n      \"project_id\": \"project_id\",\n      \"namespace_name\": \"namespacehere\",\n      \"pod_name\": \"pod-name-here\",\n      \"cluster_name\": 
\"cluster-name-here\",\n      \"location\": \"location-here\"\n    }\n  },\n  \"timestamp\": \"2024-04-04\",\n  \"severity\": \"INFO\",\n  \"labels\": {\n    \"api-here/resource_name\": \"resource-name\",\n    \"k8s-pod/controller-revision-hash\": \"controller-here\",\n    \"k8s-pod/pod-here\": \"1\",\n    \"k8s-pod/app\": \"podapp\"\n  },\n  \"logName\": \"projectshere\",\n  \"receiveTimestamp\": \"2024-05-04\"\n}",
  "receiveTimestamp": "2024-05-04Z",
  "resource": {
    "labels": {
      "cluster_name": "cluster_name",
      "container_name": "conatiner-name",
      "location": "thelocation",
      "namespace_name": "namespacehere",
      "pod_name": "podnamehere",
      "project_id": "projectidt"
    },
    "type": "k8s_container"
  },
  "severity": "INFO",
  "textPayload": "time=\"2024-05-04TZ\" type=\"container_app_firewall_audit\" container_id=\"idhere\" container_name=\"namehere\" image_name=\"uimagename\" hostname=\"hostnamehere\" effect=\"prevent\" msg=\"Detected Code Injection attack in request body parameter \\\"bsh.script\\\", match exec(\\\"ipconfig\\\"), value exec(\\\"ipconfig\\\"), injection language: php\" log_type=\"codeInjection\" source_ip=\"0.0.0.0\" source_country=\"countryhere\" connecting_ips=\"0.0.0.0\" request_method=\"POST\" request_user_agents=\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.2.7 (KHTML, like Gecko) Version/9.1.2 Safari/601.2.7\" request_host=\"hosturlhere\" request_url=\"requesturlhere.com/path/here\" request_path=\"/path/here\" request_header_names=\"Accept-Encoding,Content-Length,Content-Type,User-Agent,Via,X-Cloud-Trace-Context,X-appnamehere-Parent-Id,X-appnamehere-Sampling-Priority,X-appnamehere-Trace-Id,X-Envoy-Attempt-Count,X-Envoy-External-Address,X-Envoy-Original-Path,X-Forwarded-For,X-Forwarded-Port,X-Forwarded-Proto,X-Request-Id\" cluster=\"clusternamehere\" attack_techniques=\"exploit\" rule_name=\"rulenamehere\" rule_app_id=\"appnamehere\" protection=\"firewall\" attack_field_type=\"formBody\" attack_field_key=\"bsh.script\" attack_field_value=\"exec(\"ipconfig\")\" event_id=\"event-id8\"",
  "textPayload_array": {
    "0": "time=\"2024-05-04",
    "1": "type=\"container_app_firewall_audit",
    "10": "source_country=\"country",
    "11": "connecting_ips=\"0.0.0.0,1.2.3.4",
    "12": "request_method=\"POST",
    "13": "request_user_agents=\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.2.7 (KHTML, like Gecko) Version/9.1.2 Safari/601.2.7",
    "14": "request_host=\"request.host.here.com",
    "15": "request_url=\"request.url.here.com/path/here",
    "16": "request_path=\"/path/here",
    "17": "request_header_names=\"Accept-Encoding,Content-Length,Content-Type,User-Agent,Via,X-Cloud-Trace-Context,X-app-here-Parent-Id,X-app-here-Sampling-Priority,X-app-here-Trace-Id,X-Envoy-Attempt-Count,X-Envoy-External-Address,X-Envoy-Original-Path,X-Forwarded-For,X-Forwarded-Port,X-Forwarded-Proto,X-Request-Id",
    "18": "cluster=\"cluster-namehere",
    "19": "attack_techniques=\"exploit-here",
    "2": "container_id=\"container-id-here-123456789",
    "20": "rule_name=\"rule-name-here",
    "21": "rule_app_id=\"app-id-here",
    "22": "protection=\"firewall",
    "23": "attack_field_type=\"formBody",
    "24": "attack_field_key=\"bsh.script",
    "25": "attack_field_value=\"exec(\"ipconfig\")",
    "26": "event_id=\"123456-event-id-here\"",
    "3": "container_name=\"container-name",
    "4": "image_name=\"imagename-here",
    "5": "hostname=\"hostname-here",
    "6": "effect=\"prevent",
    "7": "msg=\"Detected Code Injection attack in request body parameter \\\"bsh.script\\\", match exec(\\\"ipconfig\\\"), value exec(\\\"ipconfig\\\"), injection language: php",
    "8": "log_type=\"codeInjection",
    "9": "source_ip=\"0.0.0.0"
  },
  "timestamp": "2024-05-04"
}

 

Raw Log (Sanitized):

 

{
  "textPayload": "time\u003d\"2024-05-04TT00:00:00.00000Z\" type\u003d\"container_app_firewall_audit\" container_id\u003d\"container_idhere\" container_name\u003d\"containernamehere\" image_name\u003d\"imagenamehere\" hostname\u003d\"hostname\" effect\u003d\"prevent\" msg\u003d\"Detected Code Injection attack in request body parameter \\\"bsh.script\\\", match exec(\\\"ipconfig\\\"), value exec(\\\"ipconfig\\\"), injection language: php\" log_type\u003d\"codeInjection\" source_ip\u003d\"0.0.0.0\" source_country\u003d\"country\" connecting_ips\u003d\"0.0.0.0\" request_method\u003d\"POST\" request_user_agents\u003d\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.2.7 (KHTML, like Gecko) Version/9.1.2 Safari/601.2.7\" request_host\u003d\"requiresturlhere.com\" request_url\u003d\"requesthrul.com/path/here\" request_path\u003d\"/path/here\" request_header_names\u003d\"Accept-Encoding,Content-Length,Content-Type,User-Agent,Via,X-Cloud-Trace-Context,X-app-name-here-Parent-Id,X-app-name-here-Sampling-Priority,X-app-name-here-Trace-Id,X-Envoy-Attempt-Count,X-Envoy-External-Address,X-Envoy-Original-Path,X-Forwarded-For,X-Forwarded-Port,X-Forwarded-Proto,X-Request-Id\" cluster\u003d\"projectnamehere\" attack_techniques\u003d\"exploit\" rule_name\u003d\"rulename\" rule_app_id\u003d\"app-name\" protection\u003d\"firewall\" attack_field_type\u003d\"formBody\" attack_field_key\u003d\"scriptt\" attack_field_value\u003d\"exec(\"ipconfig\")\" event_id\u003d\"eventidhere\"",
  "insertId": "id_here",
  "resource": {
    "type": "k8s_container",
    "labels": {
      "container_name": "twistlock-defender",
      "project_id": "projectnamehere",
      "namespace_name": "prismacloud",
      "pod_name": "podnamehere",
      "cluster_name": "clusternamehere",
      "location": "location"
    }
  },
  "timestamp": "2024-05-04TT00:00:00.00000Z",
  "severity": "INFO",
  "labels": {
    "compute.googleapis.com/resource_name": "resourcenamehere",
    "k8s-pod/controller-revision-hash": "hash_id",
    "k8s-pod/pod-template-generation": "1",
    "k8s-pod/app": "podname"
  },
  "logName": "projects/projectnamehere/logs/stdout",
  "receiveTimestamp": "2024-05-04TT00:00:00.00000Z"
}

 

Parser Extension:

 

filter{
    mutate {
        replace => {
            "textPayload" => ""
        }
    }
    json {
        source => "message"
        array_function => "split_columns"
        on_error => "_not_json"
    }

if [textPayload] != "" 
{

  mutate {
    split => {
      source => "textPayload"
      separator => "\" "
      target => "textPayload_array"
    }
  }
  mutate {
    merge => {
      "event.idm.read_only_udm.metadata.description" => "textPayload_array.1"
    }
  }
  mutate {
  merge => {
    "@output" => "event"
 }
}
}
statedump {
  label => "foo"
}
}

 



Hi,

Based on the previous comments, I can see that "textPayload" contains multiple values; it's a list. Assuming you'd like to map all of its contents, a list must be mapped to a repeated UDM field, and you must use a for loop to do so. I would suggest adding this data to the "additional" UDM field.

Hi @Rene_Figueroa and @s_shubh 

Thank you for the suggestions. The parser below worked once I set the field with a replace and "%{textPayload_array.1}". However, I now need to split each element again on the equals sign and quote (=") to set up a key/value pair, where "type" is the key and "container_app_firewall_audit" is the value.

Also, where should I place the for loop within the parser below?

Output

metadata.description: "type="container_app_firewall_audit"



Parser

filter {
  mutate {
    replace => {
      "textPayload" => ""
    }
  }
  json {
    source => "message"
    array_function => "split_columns"
    on_error => "_not_json"
  }

  if [textPayload] != "" {
    mutate {
      split => {
        source => "textPayload"
        separator => "\" "
        target => "textPayload_array"
      }
    }
    mutate {
      replace => {
        "event.idm.read_only_udm.metadata.description" => "%{textPayload_array.1}"
      }
    }
    mutate {
      merge => {
        "@output" => "event"
      }
    }
  }
  statedump {
    label => "foo"
  }
}

 

Since all of this data is JSON, the json function should split all of the values into key-value pairs. Also, do you want to map all of the contents of "textPayload_array" to the UDM event? If so, metadata.description won't work, since it does not hold arrays. That's why I suggested using "additional" instead.

Do you have a whole sample raw log we can review?

Hi @Rene_Figueroa 

Yes, I would like to map the entire array to the UDM event as key/value pairs, so I can write a detection rule that alerts on a specific value of a given key (field).

I've tried adding "additional" as shown below, but I get the following error. I followed the guide here from the documentation, but none of the combinations work.

Parser Error: 

generic::unknown: pipeline.ParseLogEntry failed: LOG_PARSING_CBN_ERROR: "generic::invalid_argument: pipeline failed: filter mutate (4) failed: replace failure: field \"event.idm.read_only_udm.additional.ListValue\": source field \"textPayload_array\": source field value must be a string"

 

Parser Update with Additional:

 

filter {
  mutate {
    replace => {
      "textPayload" => ""
    }
  }
  json {
    source => "message"
    array_function => "split_columns"
    on_error => "_not_json"
  }

  if [textPayload] != "" {
    mutate {
      split => {
        source => "textPayload"
        separator => "\" "
        target => "textPayload_array"
      }
    }
    mutate {
      replace => {
        "event.idm.read_only_udm.additional.ListValue" => "%{textPayload_array}"
      }
    }
    mutate {
      merge => {
        "@output" => "event"
      }
    }
  }
  statedump {
    label => "foo"
  }
}


Raw Log (Sanitized):

{
  "textPayload": "time\u003d\"2024-05-04TT00:00:00.00000Z\" type\u003d\"container_app_firewall_audit\" container_id\u003d\"container_idhere\" container_name\u003d\"containernamehere\" image_name\u003d\"imagenamehere\" hostname\u003d\"hostname\" effect\u003d\"prevent\" msg\u003d\"Detected Code Injection attack in request body parameter \\\"bsh.script\\\", match exec(\\\"ipconfig\\\"), value exec(\\\"ipconfig\\\"), injection language: php\" log_type\u003d\"codeInjection\" source_ip\u003d\"0.0.0.0\" source_country\u003d\"country\" connecting_ips\u003d\"0.0.0.0\" request_method\u003d\"POST\" request_user_agents\u003d\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.2.7 (KHTML, like Gecko) Version/9.1.2 Safari/601.2.7\" request_host\u003d\"requiresturlhere.com\" request_url\u003d\"requesthrul.com/path/here\" request_path\u003d\"/path/here\" request_header_names\u003d\"Accept-Encoding,Content-Length,Content-Type,User-Agent,Via,X-Cloud-Trace-Context,X-app-name-here-Parent-Id,X-app-name-here-Sampling-Priority,X-app-name-here-Trace-Id,X-Envoy-Attempt-Count,X-Envoy-External-Address,X-Envoy-Original-Path,X-Forwarded-For,X-Forwarded-Port,X-Forwarded-Proto,X-Request-Id\" cluster\u003d\"projectnamehere\" attack_techniques\u003d\"exploit\" rule_name\u003d\"rulename\" rule_app_id\u003d\"app-name\" protection\u003d\"firewall\" attack_field_type\u003d\"formBody\" attack_field_key\u003d\"scriptt\" attack_field_value\u003d\"exec(\"ipconfig\")\" event_id\u003d\"eventidhere\"",
  "insertId": "id_here",
  "resource": {
    "type": "k8s_container",
    "labels": {
      "container_name": "twistlock-defender",
      "project_id": "projectnamehere",
      "namespace_name": "prismacloud",
      "pod_name": "podnamehere",
      "cluster_name": "clusternamehere",
      "location": "location"
    }
  },
  "timestamp": "2024-05-04TT00:00:00.00000Z",
  "severity": "INFO",
  "labels": {
    "compute.googleapis.com/resource_name": "resourcenamehere",
    "k8s-pod/controller-revision-hash": "hash_id",
    "k8s-pod/pod-template-generation": "1",
    "k8s-pod/app": "podname"
  },
  "logName": "projects/projectnamehere/logs/stdout",
  "receiveTimestamp": "2024-05-04TT00:00:00.00000Z"
}

 

Hi, 

Our syntax supports "additional.string_value". You would need to map every key-value pair as follows:

# Build one key/value object, then append it to the repeated field.
mutate {
  replace => {
    "textproto.key" => "%{keyvalue}"
    "textproto.value.string_value" => "%{value}"
  }
}
mutate {
  merge => {
    "event.idm.read_only_udm.additional.fields" => "textproto"
  }
}

You can enclose the above in a for loop that iterates over all of your key-value pairs. Note that this will only work when "textPayload" meets your parser criteria.
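
As a rough sketch of how such a loop could fit into the parser posted earlier in this thread: the loop sits inside the if block, after the split and before the merge into @output. The names pair, kv_key, kv_value, pair_not_kv and additional_field are made-up placeholders, and the grok pattern assumes every element looks like key="value, with the closing quote already consumed by the split separator. This is untested against the sample log. Because the statedump shows textPayload_array keyed by "0", "1", and so on, the map form of the for loop may be needed instead.

```
filter {
  # Initialize the field so the condition below is safe when it is missing.
  mutate {
    replace => {
      "textPayload" => ""
    }
  }
  json {
    source => "message"
    array_function => "split_columns"
    on_error => "_not_json"
  }

  if [textPayload] != "" {
    # Split on quote-plus-space so spaces inside quoted values are kept.
    mutate {
      split => {
        source => "textPayload"
        separator => "\" "
        target => "textPayload_array"
      }
    }

    for index, pair in textPayload_array {
      # Clear the scratch fields left over from the previous iteration.
      mutate {
        replace => {
          "kv_key" => ""
          "kv_value" => ""
        }
      }
      # Everything before the first = is the key; the rest, minus the
      # opening quote, is the value.
      grok {
        match => {
          "pair" => "^(?P<kv_key>[^=]+)=\"(?P<kv_value>.*)$"
        }
        overwrite => ["kv_key", "kv_value"]
        on_error => "pair_not_kv"
      }

      if [kv_key] != "" {
        # The last element keeps its closing quote, so strip one if present.
        mutate {
          gsub => ["kv_value", "\"$", ""]
        }
        mutate {
          replace => {
            "additional_field.key" => "%{kv_key}"
            "additional_field.value.string_value" => "%{kv_value}"
          }
        }
        mutate {
          merge => {
            "event.idm.read_only_udm.additional.fields" => "additional_field"
          }
        }
        # Drop the temporary object so the next pair starts clean.
        mutate {
          remove_field => ["additional_field"]
        }
      }
    }

    mutate {
      merge => {
        "@output" => "event"
      }
    }
  }
}
```

If this behaves as intended, each pair from textPayload should end up as its own entry under additional.fields, which a rule could then match by key. Adding a statedump after the loop is a quick way to confirm what was actually produced.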

I reviewed our default KUBERNETES_NODE parser, and I see that we already map "textpayload" to "about.labels", so you can already write your rule against that UDM field.