Sensitivity Classification

Automatically identify and classify sensitive data across your database catalog, including PII, PCI, PHI, and other compliance-critical information.

Video tutorial (4.5 mins): Identifying PII and Compliance Issues

Learn how to use the Data Sensitivity Agent to automatically classify sensitive data and manage compliance requirements across your database.

Overview

The Data Sensitivity Agent automatically scans your database after catalog discovery to identify and classify sensitive information. This helps ensure compliance with data protection regulations and enables proper handling of sensitive data.

How It Works

Automatic Scanning

  1. Associate the Data Sensitivity Agent with your data source
  2. After each catalog scan, the agent automatically analyzes all tables and columns
  3. AI-powered classification identifies sensitive data patterns
  4. Results are displayed throughout the platform
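The agent's classifier is AI-powered, and its internals are not described on this page. Purely to illustrate what "identifying sensitive data patterns" can mean, the sketch below uses a much simpler stand-in technique: regular expressions matched against column names. The rule table, the `guess_category` function, and the sample column names are all hypothetical and are not part of the product.

```python
import re

# Hypothetical name-based rules. This is NOT how the agent works; it is a
# simplified regex heuristic used only for illustration.
# Rules are checked in order and the first match wins.
RULES = [
    ("PCI", re.compile(r"credit_?card|card_?number|cvv", re.IGNORECASE)),
    ("PHI", re.compile(r"diagnosis|medical|patient", re.IGNORECASE)),
    ("PII", re.compile(r"ssn|tax_?id|first_?name|last_?name|address", re.IGNORECASE)),
]


def guess_category(column_name: str) -> str:
    """Return a data category for a column name, or "Unassigned" if no rule matches."""
    for category, pattern in RULES:
        if pattern.search(column_name):
            return category
    return "Unassigned"


for name in ["customer_ssn", "card_number", "patient_diagnosis", "AV"]:
    print(name, "->", guess_category(name))
```

A name-only heuristic like this cannot handle ambiguous names such as "AV", which is one reason a classifier that also considers context can do better.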

Classification Coverage

The agent identifies:

  • PII (Personally Identifiable Information): Names, SSNs, tax IDs, addresses
  • PHI (Protected Health Information): Medical records, health identifiers
  • PCI (Payment Card Industry): Credit card numbers, CVV codes, payment data
  • HCI (Highly Confidential Information): Business-critical proprietary data
  • General: General business data
  • Unassigned: Not yet categorized

Viewing Classification Results

1. Dashboard Widget

The sensitivity analysis widget on the data source dashboard provides:

  • Overview of sensitivity distribution
  • Breakdown by sensitivity level (High, Medium, Low, None)
  • Count of classified tables and columns
  • Quick filters to drill down into specific categories

2. Catalog Integration

Classifications appear directly in the catalog:

  • Sensitivity badges on columns while browsing tables
  • Color-coded indicators for quick identification
  • Inline classification details

3. Agent View

Access the dedicated Data Sensitivity Agent view for comprehensive analysis:

Summary Section

  • Total classified items (tables and columns)
  • Distribution across sensitivity levels
  • Breakdown by data category (PII, PCI, PHI, HCI)
  • Interactive filters: Click any metric to filter the detailed view

Classification Details

The detailed classification list shows:

Field | Description
Entity Name | Table or column name
Sensitivity Level | High, Medium, Low, None, or Unassigned
Data Category | PII, PHI, PCI, HCI, General, or Unassigned
Confidence Score | AI confidence in the classification (0-100%)
Reason | Explanation for the classification
Status | Pending review, Approved, or Corrected
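If you handle classification results in your own scripts, the fields listed above map naturally onto a small record type. The following is a minimal sketch; the class and attribute names are assumptions chosen for illustration, not a schema published by the platform.

```python
from dataclasses import dataclass
from enum import Enum


class SensitivityLevel(Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"
    NONE = "None"
    UNASSIGNED = "Unassigned"


class DataCategory(Enum):
    PII = "PII"
    PHI = "PHI"
    PCI = "PCI"
    HCI = "HCI"
    GENERAL = "General"
    UNASSIGNED = "Unassigned"


class ReviewStatus(Enum):
    PENDING_REVIEW = "Pending review"
    APPROVED = "Approved"
    CORRECTED = "Corrected"


@dataclass
class Classification:
    """Hypothetical record mirroring the fields of the classification list."""
    entity_name: str
    sensitivity_level: SensitivityLevel
    data_category: DataCategory
    confidence: float  # percentage, 0-100
    reason: str
    status: ReviewStatus = ReviewStatus.PENDING_REVIEW

    def __post_init__(self) -> None:
        # Reject confidence values outside the documented 0-100% range.
        if not 0 <= self.confidence <= 100:
            raise ValueError("confidence must be between 0 and 100")


example = Classification(
    entity_name="customers.ssn",
    sensitivity_level=SensitivityLevel.HIGH,
    data_category=DataCategory.PII,
    confidence=97.0,
    reason="Column name and values match a social security number pattern",
)
print(example.status.value)
```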

Sorting and Filtering

Sorting Options

Use the Sort dropdown to organize classifications:

  • Date (Newest first): Most recent classifications
  • Date (Oldest first): Oldest classifications
  • Sensitivity (High → Low): High sensitivity items first
  • Sensitivity (Low → High): Low sensitivity items first
  • Confidence (High → Low): Most confident classifications first
  • Confidence (Low → High): Least confident classifications first

Low confidence typically indicates:

  • Ambiguous column names (e.g., "AV", "FFF", "AF")
  • Free-form text fields that may contain mixed content
  • Custom business fields without clear patterns
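The same orderings are easy to reproduce on exported data. The sketch below assumes each classification is a plain dictionary with hypothetical `entity`, `sensitivity`, and `confidence` keys; the sample rows are invented for illustration.

```python
# Rank sensitivity levels so they can be sorted from High to Low.
SENSITIVITY_RANK = {"High": 3, "Medium": 2, "Low": 1, "None": 0}

# Invented sample rows; key names are assumptions, not a platform schema.
rows = [
    {"entity": "customers.ssn", "sensitivity": "High", "confidence": 98},
    {"entity": "orders.AV", "sensitivity": "Low", "confidence": 41},
    {"entity": "patients.notes", "sensitivity": "Medium", "confidence": 63},
]

# Sensitivity (High -> Low): most sensitive items first.
by_sensitivity = sorted(
    rows, key=lambda r: SENSITIVITY_RANK[r["sensitivity"]], reverse=True
)

# Confidence (Low -> High): least confident items first, which is a
# convenient order for a manual review pass.
least_confident_first = sorted(rows, key=lambda r: r["confidence"])

for row in least_confident_first:
    print(row["entity"], row["confidence"])
```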

Filter Options

Available filters in the Classification Activities view:

  • Sensitivity Level dropdown: All Levels, High, Medium, Low, None
  • Time Range dropdown: All Time, Today, Last 7 Days, Last 30 Days
  • Table Filter dropdown: All Tables, or specific table selection
  • Search box: Find specific activities by text
  • Clear button: Reset all filters when active
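As a rough model of how these filters combine, the sketch below applies each active filter in turn, so a row must pass all of them. The `filter_activities` function, its parameters, and the row keys are hypothetical; in particular, matching the search text against the column name only is an assumption.

```python
from datetime import datetime, timedelta


def filter_activities(rows, level=None, max_age=None, table=None, text=None, now=None):
    """Return rows that pass every filter that is set (None means "All")."""
    now = now or datetime.now()
    result = []
    for row in rows:
        if level is not None and row["sensitivity"] != level:
            continue
        if max_age is not None and now - row["classified_at"] > max_age:
            continue
        if table is not None and row["table"] != table:
            continue
        # Assumption: the search box matches column names, case-insensitively.
        if text is not None and text.lower() not in row["column"].lower():
            continue
        result.append(row)
    return result


# Invented sample data with a fixed "now" so the results are reproducible.
now = datetime(2024, 6, 15)
rows = [
    {"table": "customers", "column": "ssn", "sensitivity": "High",
     "classified_at": datetime(2024, 6, 14)},
    {"table": "customers", "column": "email", "sensitivity": "Medium",
     "classified_at": datetime(2024, 6, 1)},
    {"table": "orders", "column": "AV", "sensitivity": "Low",
     "classified_at": datetime(2024, 3, 1)},
]

last_7_days = filter_activities(rows, max_age=timedelta(days=7), now=now)
print([f"{r['table']}.{r['column']}" for r in last_7_days])
```

Calling `filter_activities(rows)` with no arguments corresponds to the Clear button: every row is returned.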

Review Mode

Enable Review Mode to systematically validate and correct classifications:

Entering Review Mode

  1. Toggle the Review Mode switch in the Classification Activities header
  2. The interface switches to a table-grouped review layout
  3. Each table can be expanded to show its column classifications

Review Actions

For each classification, you can:

Approve Classification

  • Click the Approve button (thumbs up icon)
  • This confirms that the AI classification is correct
  • The view then moves to the next item automatically

Reject and Correct

  1. Click the Disapprove button (thumbs down icon)
  2. A popover opens with:
    • Sensitivity Level dropdown (High, Medium, Low, None)
    • Data Category dropdown (PII, PHI, PCI, HCI, General)
    • Explanation text area (optional)
  3. Select the correct classification
  4. Click Save Correction
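The two review actions can be modeled as simple state changes on a classification record. This is a minimal sketch under assumed names: the `approve` and `correct` functions and the dictionary keys are hypothetical, while the allowed values mirror the dropdown options listed above.

```python
# Values offered by the correction popover, as listed in the steps above.
CORRECTABLE_LEVELS = ("High", "Medium", "Low", "None")
CORRECTABLE_CATEGORIES = ("PII", "PHI", "PCI", "HCI", "General")


def approve(record: dict) -> dict:
    """Return a copy of the record marked as approved."""
    return {**record, "status": "Approved"}


def correct(record: dict, level: str, category: str, explanation: str = "") -> dict:
    """Return a copy of the record with a corrected classification."""
    if level not in CORRECTABLE_LEVELS:
        raise ValueError(f"unknown sensitivity level: {level!r}")
    if category not in CORRECTABLE_CATEGORIES:
        raise ValueError(f"unknown data category: {category!r}")
    return {
        **record,
        "sensitivity": level,
        "category": category,
        "explanation": explanation,  # optional, may be empty
        "status": "Corrected",
    }


record = {
    "entity": "vendors.AF",
    "sensitivity": "Low",
    "category": "General",
    "status": "Pending review",
}
corrected = correct(record, "High", "PII", "Column stores tax IDs")
print(corrected["status"], corrected["category"])
```

Returning a new dictionary instead of mutating the input keeps the original AI classification available, which can be useful when documenting decisions for an audit trail.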

Review Mode Features

Review Mode provides:

  • Bulk selection: Select multiple items for batch approval
  • Table grouping: Classifications grouped by table
  • Expandable tables: Click chevron to expand/collapse table columns
  • Pagination: Navigate through large result sets
  • Filter controls: Sensitivity level, category, and review status filters
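Table grouping and batch approval can be sketched in a few lines. The helper names, the `id` key, and the sample rows below are assumptions made for illustration only.

```python
from collections import defaultdict


def group_by_table(rows):
    """Group classification rows by their table name."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["table"]].append(row)
    return dict(grouped)


def bulk_approve(rows, selected_ids):
    """Return new rows where every selected row is marked as approved."""
    return [
        {**row, "status": "Approved"} if row["id"] in selected_ids else row
        for row in rows
    ]


# Invented sample rows.
rows = [
    {"id": 1, "table": "customers", "column": "ssn", "status": "Pending review"},
    {"id": 2, "table": "customers", "column": "email", "status": "Pending review"},
    {"id": 3, "table": "orders", "column": "AV", "status": "Pending review"},
]

grouped = group_by_table(rows)
approved = bulk_approve(rows, selected_ids={1, 2})
print({table: len(items) for table, items in grouped.items()})
```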

Machine Learning Feedback

Corrections are fed back into the classification algorithm:

  • The system learns from your corrections
  • Future classifications improve based on feedback
  • Pattern recognition adapts to your data conventions

Best Practices

Regular Reviews

  • Review classifications after initial scanning
  • Focus on high-value or high-risk data first
  • Periodically review low-confidence classifications

Handling Ambiguous Fields

  • Document business-specific field meanings
  • Provide clear corrections with explanations
  • Consider renaming ambiguous columns for clarity

Compliance Considerations

  • Ensure all PCI data is properly identified
  • Verify PHI classifications for healthcare compliance
  • Document classification decisions for audit trails

Integration with Other Features

Data Catalog

  • Classifications appear inline while browsing
  • Edit column metadata to update classifications
  • Export classification reports

Access Control

  • Use classifications to inform access policies
  • Restrict access to highly sensitive data
  • Monitor access to classified data
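One way classifications might inform an access policy is a clearance check keyed on sensitivity level. The sketch below is hypothetical throughout: the numeric clearance scale and the `can_read` function are illustrative assumptions, not a feature of the platform.

```python
# Hypothetical minimum clearance required to read each sensitivity level.
REQUIRED_CLEARANCE = {"High": 3, "Medium": 2, "Low": 1, "None": 0}


def can_read(user_clearance: int, column_sensitivity: str) -> bool:
    """Return True if the user's clearance meets the column's requirement."""
    # Unknown or Unassigned levels are treated as High, a deliberately
    # conservative default for this sketch.
    required = REQUIRED_CLEARANCE.get(column_sensitivity, REQUIRED_CLEARANCE["High"])
    return user_clearance >= required


print(can_read(2, "Medium"))
print(can_read(2, "High"))
print(can_read(2, "Unassigned"))
```

Treating unclassified columns as highly sensitive means a column is not exposed simply because it has not been reviewed yet.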

Data Quality

  • Correlate sensitivity with data quality rules
  • Apply stricter validation to sensitive fields
  • Monitor changes to sensitive data

Troubleshooting

Common Issues

Issue | Cause | Solution
Missing classifications | Agent not associated with data source | Add Data Sensitivity Agent to data source
Low confidence scores | Ambiguous column names | Review and correct, consider renaming columns
Incorrect categories | Domain-specific data patterns | Provide corrections to train the model
Classifications not updating | Cache delay | Wait for next scan or trigger manual refresh

API Access

Classifications can be accessed programmatically:

  • Query classifications by table or column
  • Submit corrections via API
  • Export classification reports
  • Integrate with data governance tools
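This page does not document the API's endpoints or payloads, so the sketch below avoids inventing them. It only shows the kind of post-processing a governance integration might do once classifications have been retrieved: writing them out as a CSV report with Python's standard library. The `to_csv` function and the field names are assumptions.

```python
import csv
import io

# Hypothetical column order for the report; not a documented export format.
FIELDS = ["entity_name", "sensitivity_level", "data_category", "confidence", "status"]


def to_csv(rows) -> str:
    """Serialize classification dictionaries to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=FIELDS,
        extrasaction="ignore",  # silently drop keys that are not in FIELDS
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


report = to_csv([
    {
        "entity_name": "customers.ssn",
        "sensitivity_level": "High",
        "data_category": "PII",
        "confidence": 98,
        "status": "Approved",
    },
])
print(report)
```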