Actions

Actions transform data validation from passive reporting to active response by automatically executing code when quality issues arise. They bridge the gap between detection and intervention, enabling immediate notifications and comprehensive logging when thresholds are exceeded.

Whether you need simple console messages for interactive analysis or complex alerting for production pipelines, Actions provide the framework to make your validation workflows responsive. For example, when validating revenue values, you can configure immediate alerts if failures exceed acceptable thresholds, ensuring data issues are addressed promptly rather than discovered later.

In this article, we’ll explore how to use Actions to respond to threshold violations during data validation, and Final Actions to execute code after all validation steps are complete, giving you powerful tools to monitor, alert, and report on your data’s quality.

How Actions Work

Let’s look at an example on how this works in practice. The following validation plan contains a single step (using col_vals_gt()) where the thresholds= and actions= parameters are set using Thresholds and Actions calls:

import pointblank as pb

(
    pb.Validate(data=pb.load_dataset(dataset="small_table"))
    .col_vals_gt(
        columns="c", value=2,
        thresholds=pb.Thresholds(warning=1, error=5),

        # Emit a console message when the warning threshold is exceeded ---
        actions=pb.Actions(warning="WARNING: failing test found.")
    )
    .interrogate()
)

WARNING: failing test found.

		STEP	COLUMNS	VALUES	EVAL	UNITS	PASS	FAIL	W	E	C
Pointblank Validation
2025-05-21\|17:29:26 Polars
#AAAAAA	1	col_vals_gt()	c	2	✓	13	10 0.77	3 0.23	●	○	—

The code uses thresholds=pb.Thresholds(warning=1, error=5) to set a ‘warning’ threshold of 1 and an ‘error’ threshold of 5 failing test units. The results part of the validation table shows that:

The FAIL column shows that 3 tests units have failed
The W column (short for ‘warning’) shows a filled gray circle indicating it’s reached its threshold level
The E (‘error’) column shows an open yellow circle indicating it’s below the threshold level

More importantly, the text "WARNING: failing test found." has been emitted. Here it appears above the validation table and that’s because the action is executed eagerly during interrogation (before the report has even been generated).

So, an action is executed for a particular condition (e.g., ‘warning’) within a validation step if these three things are true:

there is a threshold set for that condition (either globally, or as part of that step)
there is an associated action set for the condition (again, either set globally or within the step)
during interrogation, the threshold value for the condition was exceeded by the number or proportion of failing test units

There is a lot of flexibility for setting both thresholds and actions and everything here is considered optional. Put another way, you can set various thresholds and various actions as needed and the interrogation phase will determine whether all the requirements are met for executing an action.

Defining Actions

Actions can be defined in several ways, providing flexibility for different notification needs.

Using String Messages

There are a few options in how to define the actions:

String: a message to be displayed in the console
Callable: a function to be called
List of Strings/Callables: for execution of multiple messages or functions

The actions are executed at interrogation time when the threshold level assigned to the action is exceeded by the number or proportion of failing test units. When providing a string, it will simply be printed to the console. A callable will also be executed at the time of interrogation. If providing a list of strings or callables, each item in the list will be executed in order. Such a list can contain a mix of strings and callables.

Displaying console messages may be a simple approach, but it is effective. And the strings don’t have to be static, there are templating features that can be useful for constructing strings for a variety of situations. The following placeholders are available for use:

{type}: The validation step type where the action is executed (e.g., ‘col_vals_gt’, etc.)
{level}: The threshold level where the action is executed (‘warning’, ‘error’, or ‘critical’)
{step} or {i}: The step number in the validation workflow where the action is executed
{col} or {column}: The column name where the action is executed
{val} or {value}: An associated value for the validation method
{time}: A datetime value for when the action was executed

Here’s an example where we prepare a console message with a number of value placeholders (action_str) and use it globally at Actions(critical=):

action_str = "[{LEVEL}: {TYPE}]: Step {step} has failed validation. ({time})"

(
    pb.Validate(
        data=pb.load_dataset(dataset="game_revenue", tbl_type="duckdb"),
        thresholds=pb.Thresholds(warning=0.05, error=0.10, critical=0.15),

        # Use `action_str` for any critical thresholds exceeded ---
        actions=pb.Actions(critical=action_str),
    )
    .col_vals_regex(columns="player_id", pattern=r"[A-Z]{12}\d{3}")
    .col_vals_gt(columns="item_revenue", value=0.10)
    .col_vals_ge(columns="session_duration", value=15)
    .interrogate()
)

[CRITICAL: COL_VALS_GT]: Step 2 has failed validation. (2025-05-21 17:29:27.345278+00:00)
[CRITICAL: COL_VALS_GE]: Step 3 has failed validation. (2025-05-21 17:29:27.369316+00:00)

		STEP	COLUMNS	VALUES	EVAL	UNITS	PASS	FAIL	W	E	C	EXT
Pointblank Validation
2025-05-21\|17:29:27 DuckDBWARNING0.05ERROR0.1CRITICAL0.15
#4CA64C	1	col_vals_regex()	player_id	[A-Z]{12}\d{3}	✓	2000	2000 1.00	0 0.00	○	○	○	—
#FF3300	2	col_vals_gt()	item_revenue	0.1	✓	2000	1440 0.72	560 0.28	●	●	●	—
#FF3300	3	col_vals_ge()	session_duration	15	✓	2000	1686 0.84	314 0.16	●	●	●	—

What we get here are two messages in the console, corresponding to critical failures in steps 2 and 3. The placeholders were replaced with the correct text for the context. Note that some of the resulting text is capitalized (e.g., "CRITICAL", "COL_VALS_GT", etc.) and this is because we capitalized the placeholder text itself. Have a look at the documentation article of Actions for more details on this.

Using Callable Functions

Aside from strings, any callable can be used as an action value. Here’s an example where we use a custom function as part of an action:

def duration_issue():
    from datetime import datetime
    print(f"Data quality issue found ({datetime.now()}).")

(
    pb.Validate(
        data=pb.load_dataset(dataset="game_revenue", tbl_type="duckdb"),
        thresholds=pb.Thresholds(warning=0.05, error=0.10, critical=0.15),
    )
    .col_vals_regex(columns="player_id", pattern=r"[A-Z]{12}\d{3}")
    .col_vals_gt(columns="item_revenue", value=0.05)
    .col_vals_gt(
        columns="session_duration", value=15,

        # Use the `duration_issue()` function as an action for this step ---
        actions=pb.Actions(warning=duration_issue),
    )
    .interrogate()
)

Data quality issue found (2025-05-21 17:29:27.744080).

		STEP	COLUMNS	VALUES	EVAL	UNITS	PASS	FAIL	W	E	C	EXT
Pointblank Validation
2025-05-21\|17:29:27 DuckDBWARNING0.05ERROR0.1CRITICAL0.15
#4CA64C	1	col_vals_regex()	player_id	[A-Z]{12}\d{3}	✓	2000	2000 1.00	0 0.00	○	○	○	—
#EBBC14	2	col_vals_gt()	item_revenue	0.05	✓	2000	1701 0.85	299 0.15	●	●	○	—
#FF3300	3	col_vals_gt()	session_duration	15	✓	2000	1675 0.84	325 0.16	●	●	●	—

In this case, the ‘warning’ action is set to call the user’s dq_issue() function. This action is only executed when the ‘warning’ threshold is exceeded in step 3. Because all three thresholds are exceeded in that step, the ‘warning’ action of executing the function occurs (resulting in a message being printed to the console).

This is an example where actions can be defined locally for an individual validation step. The global threshold setting applied to all three validation steps but the step-level action only applied to step 3. You are free to mix and match both threshold and action settings at the global level (i.e., set in the Validate call) or at the step level. The key thing to be aware of is that step-level settings of thresholds and actions take precedence.

Accessing Context in Actions

While string templates provide helpful placeholders to access information about validation steps, callable functions offer more flexibility through access to detailed metadata. When using functions as actions, you can retrieve comprehensive information about the validation context, allowing for complex logic and dynamic responses to validation issues.

Using `get_action_metadata()` in Callables

To access information about the validation step where an action was triggered, we can call get_action_metadata() in the body of a function to be used within Actions. This provides useful context about the validation step that triggered the action.

def print_problem():
    m = pb.get_action_metadata()
    print(f"{m['level']} ({m['level_num']}) for Step {m['step']}: {m['failure_text']}")

(
    pb.Validate(
        data=pb.load_dataset(dataset="game_revenue", tbl_type="duckdb"),
        thresholds=pb.Thresholds(warning=0.05, error=0.10, critical=0.15),

        # Use the `print_problem()` function as the action ---
        actions=pb.Actions(default=print_problem),
        brief=True,
    )
    .col_vals_regex(columns="player_id", pattern=r"[A-Z]{12}\d{3}")
    .col_vals_gt(columns="item_revenue", value=0.05)
    .col_vals_gt(columns="session_duration", value=15)
    .interrogate()
)

error (40) for Step 2: Exceedance of failed test units where values in `item_revenue` should have been > `0.05`.
critical (50) for Step 3: Exceedance of failed test units where values in `session_duration` should have been > `15`.

		STEP	COLUMNS	VALUES	EVAL	UNITS	PASS	FAIL	W	E	C	EXT
Pointblank Validation
2025-05-21\|17:29:27 DuckDBWARNING0.05ERROR0.1CRITICAL0.15
#4CA64C	1	col_vals_regex() Expect that values in `player_id` should match the regular expression: [A-Z]{12}\d{3}.	player_id	[A-Z]{12}\d{3}	✓	2000	2000 1.00	0 0.00	○	○	○	—
#EBBC14	2	col_vals_gt() Expect that values in `item_revenue` should be > `0.05`.	item_revenue	0.05	✓	2000	1701 0.85	299 0.15	●	●	○	—
#FF3300	3	col_vals_gt() Expect that values in `session_duration` should be > `15`.	session_duration	15	✓	2000	1675 0.84	325 0.16	●	●	●	—

In this example, we’re creating a function called print_problem() that prints information about each validation step that fails. We then apply this function as the default action for all threshold levels using actions=pb.Actions(default=print_problem). (Note that the default= and highest_only= parameters will be covered in more detail in following sections.)

We end up seeing two messages printed for failures in Steps 2 and 3. And though those steps had more than one threshold exceeded, only the most severe level in each yielded a console message (due to the default highest_only=True behavior).

By setting the action in Validate(actions=), we applied it to all validation steps where thresholds are exceeded. This eliminates the need to set actions= at every validation step (though you can do this as a local override, even setting actions=None to disable globally set actions).

Available Metadata Fields

The dictionary returned by get_action_metadata() contains the following fields:

step: The step number.
column: The column name.
value: The value being compared (only available in certain validation steps).
type: The assertion type (e.g., "col_vals_gt", etc.).
time: The time the validation step was executed (in ISO format).
level: The severity level ("warning", "error", or "critical").
level_num: The severity level as a numeric value (30, 40, or 50).
autobrief: A localized and brief statement of the expectation for the step.
failure_text: Localized text that explains how the validation step failed.

Customizing Action Behavior

The Actions class has two additional parameters that provide more control over how actions are executed:

Setting Default Actions with `default=`

Instead of specifying actions separately for each threshold level, you can use the default parameter to set a common action for all levels:

def log_all_issues():
    m = pb.get_action_metadata()
    print(f"[{m['level'].upper()}] Validation failed in step {m['step']} with level {m['level']}")

(
    pb.Validate(
        data=pb.load_dataset(dataset="game_revenue", tbl_type="duckdb"),
        thresholds=pb.Thresholds(warning=0.05, error=0.10, critical=0.15),

        # The `log_all_issues()` callable is set to every threshold ---
        actions=pb.Actions(default=log_all_issues),
    )
    .col_vals_regex(columns="player_id", pattern=r"[A-Z]{12}\d{3}")
    .col_vals_gt(columns="item_revenue", value=0.05)
    .col_vals_gt(columns="session_duration", value=15)
    .interrogate()
)

[ERROR] Validation failed in step 2 with level error
[CRITICAL] Validation failed in step 3 with level critical

		STEP	COLUMNS	VALUES	EVAL	UNITS	PASS	FAIL	W	E	C	EXT
Pointblank Validation
2025-05-21\|17:29:28 DuckDBWARNING0.05ERROR0.1CRITICAL0.15
#4CA64C	1	col_vals_regex()	player_id	[A-Z]{12}\d{3}	✓	2000	2000 1.00	0 0.00	○	○	○	—
#EBBC14	2	col_vals_gt()	item_revenue	0.05	✓	2000	1701 0.85	299 0.15	●	●	○	—
#FF3300	3	col_vals_gt()	session_duration	15	✓	2000	1675 0.84	325 0.16	●	●	●	—

The default= parameter sets the same action for all threshold levels. If you later specify an action for a specific level, it will override this default for that level only.

When using the default= parameter, be aware that your action (whether a string template or callable function) needs to work across all validation steps where thresholds might be exceeded. Not all validation methods provide the same context for string templates or in the metadata dictionary returned by get_action_metadata().

For example, some validation steps like col_vals_gt() provide a value field that can be accessed with {value} in string templates, while others like col_exists() don’t have this concept. When creating default actions, either use only the universally available placeholders ({step}, {level}, {type}, and {time}), or include conditional logic in your callable functions to handle different validation types appropriately.

Controlling Action Execution with `highest_only=`

By default, Pointblank only executes the action for the most severe threshold level that’s been exceeded. If you want actions for all exceeded thresholds to be executed, you can set highest_only=False:

(
    pb.Validate(
        data=pb.load_dataset(dataset="game_revenue", tbl_type="duckdb"),
        thresholds=pb.Thresholds(warning=0.05, error=0.10, critical=0.15),
        actions=pb.Actions(
            warning="Warning threshold exceeded in step {step}",
            error="Error threshold exceeded in step {step}",
            critical="Critical threshold exceeded in step {step}",

            # Execute all applicable actions ---
            highest_only=False
        ),
    )
    .col_vals_gt(columns="session_duration", value=15)
    .interrogate()
)

Critical threshold exceeded in step 1
Error threshold exceeded in step 1
Warning threshold exceeded in step 1

		STEP	COLUMNS	VALUES	EVAL	UNITS	PASS	FAIL	W	E	C	EXT
Pointblank Validation
2025-05-21\|17:29:28 DuckDBWARNING0.05ERROR0.1CRITICAL0.15
#FF3300	1	col_vals_gt()	session_duration	15	✓	2000	1675 0.84	325 0.16	●	●	●	—

In this example, if all three thresholds are exceeded in a step, you’ll see all three messages printed, rather than just the critical one.

The default behavior (highest_only=True) helps prevent notification fatigue by limiting the number of actions executed when multiple thresholds are exceeded in the same validation step. For example, if a validation step fails with 60% of rows not passing, it would exceed ‘warning’, ‘error’, and ‘critical’ thresholds simultaneously. With highest_only=True, only the critical action would execute.

You might want to set highest_only=False when:

different threshold levels need to trigger different types of notifications (e.g., warnings to Slack, errors to email, critical to urgent notifications)
you need comprehensive logging of all severity levels for audit purposes
you’re building a dashboard that displays counts of issues at each severity level

Using Multiple Actions for a Threshold

You can specify multiple actions to be executed for a single threshold level by providing a list:

def send_notification():
    print("📧 Notification sent to data team")

def log_to_system():
    print("📝 Issue logged in system")

(
    pb.Validate(
        data=pb.load_dataset(dataset="game_revenue", tbl_type="duckdb"),
        thresholds=pb.Thresholds(critical=0.15),

        # Set multiple actions for the critical threshold exceedance ---
        actions=pb.Actions(
            critical=[
                "CRITICAL: Data validation failed",  # First action: display message
                send_notification,                   # Second action: call function
                log_to_system                        # Third action: call another function
            ]
        ),
    )
    .col_vals_gt(columns="session_duration", value=15)
    .interrogate()
)

CRITICAL: Data validation failed
📧 Notification sent to data team
📝 Issue logged in system

		STEP	COLUMNS	VALUES	EVAL	UNITS	PASS	FAIL	W	E	C	EXT
Pointblank Validation
2025-05-21\|17:29:28 DuckDBWARNING—ERROR—CRITICAL0.15
#FF3300	1	col_vals_gt()	session_duration	15	✓	2000	1675 0.84	325 0.16	—	—	●	—

When providing a list of actions, they will be executed in sequence when the threshold is exceeded. This allows you to combine different types of actions such as displaying messages, sending notifications, and logging events.

Final Actions

Creating Final Actions

When you need to execute actions after all validation steps are complete, Pointblank provides the FinalActions class. Unlike Actions which triggers on a per-step basis during the validation process, FinalActions executes after the entire validation is complete, giving you a way to respond to the overall validation results.

Here’s how to use FinalActions:

def send_alert():
    summary = pb.get_validation_summary()
    if summary["highest_severity"] == "critical":
        print(f"ALERT: Critical validation failures found in `{summary['tbl_name']}`")

(
    pb.Validate(
        data=pb.load_dataset(dataset="game_revenue", tbl_type="duckdb"),
        tbl_name="game_revenue",
        thresholds=pb.Thresholds(warning=0.05, error=0.10, critical=0.15),

        # Set final actions to be executed after all interrogations ---
        final_actions=pb.FinalActions(
            "Validation complete.",  # 1. a string message
            send_alert               # 2. a callable function
        )
    )
    .col_vals_regex(columns="player_id", pattern=r"[A-Z]{12}\d{3}")
    .col_vals_gt(columns="item_revenue", value=0.10)
    .interrogate()
)

Validation complete.
ALERT: Critical validation failures found in `game_revenue`

		STEP	COLUMNS	VALUES	EVAL	UNITS	PASS	FAIL	W	E	C	EXT
Pointblank Validation
2025-05-21\|17:29:28 DuckDBgame_revenueWARNING0.05ERROR0.1CRITICAL0.15
#4CA64C	1	col_vals_regex()	player_id	[A-Z]{12}\d{3}	✓	2000	2000 1.00	0 0.00	○	○	○	—
#FF3300	2	col_vals_gt()	item_revenue	0.1	✓	2000	1440 0.72	560 0.28	●	●	●	—

In this example:

We define the function send_alert() that checks the validation summary for critical failures
We provide a simple string message "Validation complete." that will print to the console
Both actions will execute in order after all validation steps have completed

Because the ‘critical’ threshold was exceeded in Step 2, we see the printed alert of send_alert() after the simple string message.

FinalActions accepts any number of actions as positional arguments. Each argument can be:

String: A message to be displayed in the console
Callable: A function to be called with no arguments
List of Strings/Callables: Multiple actions to execute in sequence

All actions will be executed in the order they are provided after all validation steps have completed.

Using `get_validation_summary()` in Final Actions

When creating a callable function to use with FinalActions, you can access information about the overall validation results using the get_validation_summary() function. This gives you a dictionary with comprehensive information about the validation:

def comprehensive_report():
    summary = pb.get_validation_summary()
    print(f"Validation Report for {summary['tbl_name']}:")
    print(f"- Steps: {summary['n_steps']}")
    print(f"- Passing steps: {summary['n_passing_steps']}")
    print(f"- Failing steps: {summary['n_failing_steps']}")

    # Take additional actions based on results
    if summary["n_failing_steps"] > 0:

        # Create a Slack notification function ---
        notify = pb.send_slack_notification(
            webhook_url="https://hooks.slack.com/services/your/webhook/url",
            summary_msg="""
            🚨 *Validation Failure Alert*
            • Table: {tbl_name}
            • Failed Steps: {n_failing_steps} of {n_steps}
            • Highest Severity: {highest_severity}
            • Time: {time}
            """,
        )

        # Execute the notification function
        notify()

(
    pb.Validate(
        data=pb.load_dataset(dataset="game_revenue", tbl_type="duckdb"),
        tbl_name="game_revenue",
        final_actions=pb.FinalActions(comprehensive_report),
    )
    .col_vals_regex(columns="player_id", pattern=r"[A-Z]{12}\d{3}")
    .col_vals_gt(columns="item_revenue", value=0.05)
    .interrogate()
)

		STEP	COLUMNS	VALUES	EVAL	UNITS	PASS	FAIL	W	E	C	EXT
Pointblank Validation
2025-05-21\|17:29:28 DuckDBgame_revenue
#4CA64C	1	col_vals_regex()	player_id	[A-Z]{12}\d{3}	✓	2000	2000 1.00	0 0.00	—	—	—	—
#4CA64C66	2	col_vals_gt()	item_revenue	0.05	✓	2000	1701 0.85	299 0.15	—	—	—	—

Here we used the send_slack_notification() function, which is available in Pointblank as a pre-built action. It can be used by itself in final_actions= but here it’s integrated into the user’s comprehensive_report() function to provide finer control with conditional logic.

Combining Step-level and Final Actions

You can use both Actions and FinalActions together for comprehensive validation control:

def log_step_failure():
    m = pb.get_action_metadata()
    print(f"Step {m['step']} failed with {m['level']}")


def generate_summary():
    summary = pb.get_validation_summary()
    # Sum up total failed test units across all steps
    total_failed = sum(summary["dict_n_failed"].values())
    # Sum up total test units across all steps
    total_units = sum(summary["dict_n"].values())
    print(f"Validation complete: {total_failed} failures out of {total_units} tests")

(
    pb.Validate(
        data=pb.load_dataset(dataset="game_revenue", tbl_type="duckdb"),
        thresholds=pb.Thresholds(warning=0.05, error=0.10),

        # Set an action for each step (highest threshold exceeded) ---
        actions=pb.Actions(default=log_step_failure),

        # Set a final action to get a summary of the validation process ---
        final_actions=pb.FinalActions(generate_summary),
    )
    .col_vals_regex(columns="player_id", pattern=r"[A-Z]{12}\d{3}")
    .col_vals_gt(columns="item_revenue", value=0.05)
    .interrogate()
)

Step 2 failed with error
Validation complete: 299 failures out of 4000 tests

		STEP	COLUMNS	VALUES	EVAL	UNITS	PASS	FAIL	W	E	C	EXT
Pointblank Validation
2025-05-21\|17:29:29 DuckDBWARNING0.05ERROR0.1CRITICAL—
#4CA64C	1	col_vals_regex()	player_id	[A-Z]{12}\d{3}	✓	2000	2000 1.00	0 0.00	○	○	—	—
#EBBC14	2	col_vals_gt()	item_revenue	0.05	✓	2000	1701 0.85	299 0.15	●	●	—	—

This approach allows you to:

log individual step failures during the validation process using Actions
generate a comprehensive report after all validation steps are complete using FinalActions

Using both action types gives you fine-grained control over when and how notifications and other actions are triggered in your validation workflow.

Conclusion

Actions provide a powerful mechanism for responding to data validation results in Pointblank. By combining threshold settings with appropriate actions, you can create sophisticated data quality workflows that:

provide immediate feedback through console messages
execute custom functions when validation thresholds are exceeded
customize notifications based on severity levels
generate comprehensive reports after validation is complete
automate responses to data quality issues

The flexible design of Actions and FinalActions allows you to start simple with basic console messages and gradually build up to complex validation workflows with conditional logic, custom reporting, and integrations with other systems like Slack, email, or logging services.

When designing your validation strategy, consider leveraging both step-level actions for immediate responses and final actions for holistic reporting. This combination provides comprehensive control over your data validation process and helps ensure that data quality issues are detected, reported, and addressed efficiently.