Creating a Custom Data Mask Plugin

admin 4 2025-01-12

Creating a custom plugin in Lua might be trivial or daunting, depending on your level of expertise in OpenResty and Nginx. In this article, we will look at how you can create and run a custom plugin from the ground up while learning some basics of plugin development.

When talking to one of our users from the fintech industry during the Apache Community Meetup in Malaysia, we came across a peculiar feature request: mask confidential data in responses.

For example, a response from the upstream might contain sensitive data like credit card numbers, and the plugin should be able to replace it with ******* based on some predefined rules.

Creating such a plugin in Lua might be trivial or daunting, depending on your level of expertise in OpenResty and Nginx. So in this article, we will look at how you can create and run this plugin from the ground up while learning some basics of plugin development in Lua.

Setting Things Up

You can start with the template plugin from -plugin-template. This repository contains boilerplate code for creating custom Lua plugins.

To use the template, go to the repository and click "Use this template." You can then clone it to your local machine for modification.

Under the /plugins directory, you will find a file named demo.lua. You can rename this to data-mask.lua. This will be the starting point for our custom plugin.

Initially, the file contains only boilerplate code: a few imports, some variable definitions, and empty function stubs. The main parts look like this:

data-mask.lua
    -- local common libs
    local require = require
    local core    = require(".core")

    -- module define
    local plugin_name = "data-mask"

    -- plugin schema
    local plugin_schema = {
        type = "object",
        properties = {},
        required = {},
    }

    local _M = {
        version  = 0.1,            -- plugin version
        priority = 0,              -- the priority of this plugin will be 0
        name     = plugin_name,    -- plugin name
        schema   = plugin_schema,  -- plugin schema
    }

    -- module interface for schema check
    -- @param `conf` user defined conf data
    -- @param `schema_type` defined in `/core/schema.lua`
    -- @return <boolean>
    function _M.check_schema(conf, schema_type)
        return core.schema.check(plugin_schema, conf)
    end

    -- module interface for header_filter phase
    function _M.header_filter(conf, ctx)
    end

    -- module interface for body_filter phase
    function _M.body_filter(conf, ctx)
    end

    return _M

There are three functions (interfaces for the plugin) declared on the structure _M:

  1. check_schema: validates the plugin configuration. It is called when the plugin is enabled on a route.
  2. header_filter: modifies the response headers before they are sent to the client.
  3. body_filter: modifies the response body before it is sent to the client.

In the end, the module returns _M, so whatever loads the plugin can read its metadata and call its functions through this table.

Designing the Plugin

Like every sound engineer, let's first design the plugin before we start writing code.

The goal of this plugin is simple:

  1. The user should be able to define what sensitive data would look like in the plugin configuration (maybe RegEx?).
  2. They should be able to define what sensitive data should be replaced with (like *******).
  3. The plugin should then modify responses based on this configuration.

So each rule can contain a regular expression and a replacement string. This rule will be applied to the response, and the masked data will be returned to the client:

    {
      "rules": [
        {
          "regex": ".*",
          "replace": "******"
        },
        {
          "regex": ".*",
          "replace": "**"
        }
      ]
    }

We can now define the JSON schema to validate the plugin configuration. This can help avoid issues with improper plugin configurations during runtime:

    local plugin_schema = {
        type = "object",
        properties = {
            rules = {
                type = "array",
                items = {
                    type = "object",
                    properties = {
                        regex = {
                            type = "string",
                            minLength = 1,
                        },
                        replace = {
                            type = "string",
                        },
                    },
                    required = {
                        "regex",
                        "replace",
                    },
                    additionalProperties = false,
                },
                minItems = 1,
            },
        },
        required = {
            "rules",
        },
    }

Here, rules is an array of objects, which means you can define multiple rules for what sensitive data looks like and what it should be replaced with. Each object in the array contains two required string fields, regex and replace, just like we designed. No other fields are allowed, and the array must contain at least one rule.
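To make the constraints concrete, here is a minimal sketch of the same checks written by hand in plain Lua. The validate_conf function is a hypothetical helper made up for illustration; it is not how core.schema.check works internally, and the plugin itself should keep using the JSON schema.

```lua
-- Hypothetical hand-written equivalent of the schema above.
-- NOT the framework's validator; it only mirrors the same constraints.
local function validate_conf(conf)
    if type(conf) ~= "table" or type(conf.rules) ~= "table" then
        return false, "rules is required and must be an array"
    end
    if #conf.rules < 1 then                       -- minItems = 1
        return false, "rules must contain at least one item"
    end
    for i, rule in ipairs(conf.rules) do
        if type(rule) ~= "table" then
            return false, "rule " .. i .. " must be an object"
        end
        if type(rule.regex) ~= "string" or #rule.regex < 1 then  -- minLength = 1
            return false, "rule " .. i .. ": regex must be a non-empty string"
        end
        if type(rule.replace) ~= "string" then
            return false, "rule " .. i .. ": replace must be a string"
        end
        for key in pairs(rule) do                 -- additionalProperties = false
            if key ~= "regex" and key ~= "replace" then
                return false, "rule " .. i .. ": unknown field " .. tostring(key)
            end
        end
    end
    return true
end
```

With this sketch, a configuration with an empty rules array, a rule missing replace, or a rule carrying an extra field is rejected, while a single well-formed rule passes.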

Let's Write Some Code!

We have now decided what the plugin's functionality would look like and added some JSON schema to validate the plugin's configuration.

We will first modify the _M.header_filter function, which is called before the response header is sent to the client. But why are we changing this? Isn't our plugin supposed to modify the response body?

Well, yes. But when we modify the data in the response body (say, from 2378-4531-5789-1369 to 2378-****-****-1369, or to a shorter string like ******), the Content-Length header may no longer be accurate. This can cause the client to treat the response as malformed and fail to complete the request.
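A quick illustration of the mismatch, using made-up response bodies: whenever the replacement is not the same length as the text it replaces, the byte count of the body changes.

```lua
-- Illustrative bodies only; the JSON shape is an assumption.
local original = '{"card":"2378-4531-5789-1369"}'   -- 30 bytes
local masked   = '{"card":"******"}'                -- 17 bytes

-- A Content-Length of 30 computed by the upstream would now be wrong.
local stale_length = #original
local actual_length = #masked
```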

Since we haven't modified the response body yet, we cannot calculate the new, accurate value for the Content-Length header. So we need to delete this header value, modify the response body, recalculate the new header value, and set it to the response. To do this in a single sweep, the core library provides the core.response.clear_header_as_body_modified function:

    function _M.header_filter(conf, ctx)
        core.response.clear_header_as_body_modified()
    end

We can now work on modifying the response body to mask the data. To do this, we must modify the _M.body_filter function.

Sometimes, the upstream response will be sent in chunks (Transfer-Encoding: chunked), and the body_filter function will be called multiple times. Since each of these chunks is incomplete by itself, we need to cache the data passed in on each call and process the body only when all chunks have been received and spliced together. And like before, the core library provides a function, core.response.hold_body_chunk, to handle this scenario:

    local body = core.response.hold_body_chunk(ctx)
    if not body then
        return
    end
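To see what "holding" the body means, here is a simplified, self-contained sketch of the buffering idea. The hold_chunk function is a hypothetical stand-in, not the real core.response.hold_body_chunk: it takes the chunk and the end-of-response flag as plain arguments (the roles played by ngx.arg[1] and ngx.arg[2] in OpenResty) so that it can run outside the gateway.

```lua
-- Hypothetical stand-in for chunk buffering; runs in plain Lua.
-- ctx:   per-request table used to keep state between calls
-- chunk: the piece of the body received in this call
-- eof:   true when this is the last chunk of the response
local function hold_chunk(ctx, chunk, eof)
    ctx.buffer = ctx.buffer or {}
    if chunk ~= nil and chunk ~= "" then
        ctx.buffer[#ctx.buffer + 1] = chunk
    end
    if not eof then
        return nil                 -- body is incomplete, keep waiting
    end
    local body = table.concat(ctx.buffer)
    ctx.buffer = nil               -- release the buffered chunks
    return body
end

-- Simulate a response arriving in three chunks.
local ctx = {}
local first  = hold_chunk(ctx, '{"card":"2378-', false)
local second = hold_chunk(ctx, '4531-5789-', false)
local full   = hold_chunk(ctx, '1369"}', true)
```

The first two calls return nil, which is why the filter returns early. Only the last call returns the complete body. Note that a card number split across two chunks, as in this simulation, could never be matched by a regular expression applied to each chunk separately.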

Now to mask the response data, we can use the ngx.re.gsub function, which takes in a regular expression and a replacement string and replaces matching strings with the replacement string.

The RegEx conforms to the PCRE specification. For example, when the expression is (.*)-(.*)-(.*)-(.*), it captures the four groups separated by -, and you can use $1, $2, $3, and $4 in the replacement string to refer to them:

    for _, rule in ipairs(conf.rules) do
        body = ngx.re.gsub(body, rule.regex, rule.replace, "jo")
    end
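If you want to experiment with the masking loop without OpenResty, the sketch below does the same thing in plain Lua. It is a stand-in under stated assumptions: it uses string.gsub with Lua patterns (%d, %1) rather than ngx.re.gsub with PCRE (\d, $1), and the mask function and the pattern field are hypothetical names used only here. In the actual plugin, a comparable rule might use the regex (\d{4})-\d{4}-\d{4}-(\d{4}) with the replacement $1-****-****-$2.

```lua
-- Pure-Lua stand-in for the masking loop (Lua patterns, not PCRE).
local function mask(body, rules)
    for _, rule in ipairs(rules) do
        -- string.gsub returns (result, count); the assignment keeps the result
        body = body:gsub(rule.pattern, rule.replace)
    end
    return body
end

local rules = {
    {
        -- keep the first and last groups of digits, mask the two in the middle
        pattern = "(%d%d%d%d)%-%d%d%d%d%-%d%d%d%d%-(%d%d%d%d)",
        replace = "%1-****-****-%2",
    },
}

local masked_body = mask('{"card":"2378-4531-5789-1369"}', rules)
```

Rules are applied in order, each one to the output of the previous rule, so an early rule can change what a later rule matches.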

Finally, to set this as the new response body, we will modify the value of ngx.arg[1] as mentioned in the OpenResty docs. Setting ngx.arg[2] to true marks this as the last chunk, so the new response body is sent to the client:

    ngx.arg[1] = body
    ngx.arg[2] = true

Combining all these, the body_filter function will look like this:

    function _M.body_filter(conf, ctx)
        local body = core.response.hold_body_chunk(ctx)
        if not body then
            return
        end

        for _, rule in ipairs(conf.rules) do
            body = ngx.re.gsub(body, rule.regex, rule.replace, "jo")
        end

        ngx.arg[1] = body
        ngx.arg[2] = true
    end

Now the only thing left to do is glue everything together. The entire plugin code will look like this:

    -- local common libs
    local require     = require
    local ipairs      = ipairs
    local ngx_re_gsub = ngx.re.gsub
    local core        = require(".core")

    -- module define
    local plugin_name = "data-mask"

    -- plugin schema
    local plugin_schema = {
        type = "object",
        properties = {
            rules = {
                type = "array",
                items = {
                    type = "object",
                    properties = {
                        regex = {
                            type = "string",
                            minLength = 1,
                        },
                        replace = {
                            type = "string",
                        },
                    },
                    required = {
                        "regex",
                        "replace",
                    },
                    additionalProperties = false,
                },
                minItems = 1,
            },
        },
        required = {
            "rules",
        },
    }

    local _M = {
        version  = 0.1,            -- plugin version
        priority = 0,              -- the priority of this plugin will be 0
        name     = plugin_name,    -- plugin name
        schema   = plugin_schema,  -- plugin schema
    }

    -- module interface for schema check
    -- @param `conf` user defined conf data
    -- @param `schema_type` defined in `/core/schema.lua`
    -- @return <boolean>
    function _M.check_schema(conf, schema_type)
        return core.schema.check(plugin_schema, conf)
    end

    -- module interface for header_filter phase
    function _M.header_filter(conf, ctx)
        core.response.clear_header_as_body_modified()
    end

    -- module interface for body_filter phase
    function _M.body_filter(conf, ctx)
        local body = core.response.hold_body_chunk(ctx)
        if not body then
            return
        end

        for _, rule in ipairs(conf.rules) do
            body = ngx_re_gsub(body, rule.regex, rule.replace, "jo")
        end

        ngx.arg[1] = body
        ngx.arg[2] = true
    end

    return _M
