Sensitivity Classification
Automatically identify and classify sensitive data across your database catalog, including PII, PCI, PHI, and other compliance-critical information.
Overview
The Data Sensitivity Agent automatically scans your database after catalog discovery to identify and classify sensitive information. This helps ensure compliance with data protection regulations and enables proper handling of sensitive data.
How It Works
Automatic Scanning
- Associate the Data Sensitivity Agent with your data source
- After each catalog scan, the agent automatically analyzes all tables and columns
- AI-powered classification identifies sensitive data patterns
- Results are displayed in the dashboard widget, the catalog, and the agent view
Classification Coverage
The agent identifies:
- PII (Personally Identifiable Information): Names, SSNs, tax IDs, addresses
- PHI (Protected Health Information): Medical records, health identifiers
- PCI (Payment Card Industry): Credit card numbers, CVV codes, payment data
- HCI (Highly Confidential Information): Business-critical proprietary data
- General: General business data
- Unassigned: Not yet categorized
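For teams that consume these categories in their own tooling, the list above can be modeled as an enumeration. This is a minimal sketch: the `DataCategory` and `is_regulated` names, and the choice to treat PII, PHI, and PCI as the regulated set, are assumptions made for illustration and are not part of the platform.

```python
from enum import Enum


class DataCategory(Enum):
    """Data categories listed in this section (member names are illustrative)."""
    PII = "Personally Identifiable Information"
    PHI = "Protected Health Information"
    PCI = "Payment Card Industry"
    HCI = "Highly Confidential Information"
    GENERAL = "General"
    UNASSIGNED = "Unassigned"


# Assumption for this example only: these three categories are the ones
# that typically map to an external compliance regime.
REGULATED = frozenset({DataCategory.PII, DataCategory.PHI, DataCategory.PCI})


def is_regulated(category: DataCategory) -> bool:
    """Return True when the category is in the assumed regulated set."""
    return category in REGULATED
```

Keeping `UNASSIGNED` as an explicit member, rather than using `None`, makes "not yet categorized" a value that can be counted and filtered like any other.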
Viewing Classification Results
1. Dashboard Widget
The sensitivity analysis widget on the data source dashboard provides:
- Overview of sensitivity distribution
- Breakdown by sensitivity level (High, Medium, Low, None)
- Count of classified tables and columns
- Quick filters to drill down into specific categories
2. Catalog Integration
Classifications appear directly in the catalog:
- Sensitivity badges on columns while browsing tables
- Color-coded indicators for quick identification
- Inline classification details
3. Agent View
Access the dedicated Data Sensitivity Agent view for comprehensive analysis:
Summary Section
- Total classified items (tables and columns)
- Distribution across sensitivity levels
- Breakdown by data category (PII, PCI, PHI, HCI)
- Interactive filters: click any metric to filter the detailed view
Classification Details
The detailed classification list shows:
Field | Description |
---|---|
Entity Name | Table or column name |
Sensitivity Level | High, Medium, Low, None, or Unassigned |
Data Category | PII, PHI, PCI, HCI, General, or Unassigned |
Confidence Score | AI confidence in the classification (0-100%) |
Reason | Explanation for the classification |
Status | Pending review, Approved, or Corrected |
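
The fields in the table above could be represented as a small validated record when classifications are handled outside the platform. The sketch below is an assumption about shape only: the `Classification` class and its attribute names are hypothetical, and only the allowed values are taken from the table.

```python
from dataclasses import dataclass

# Allowed values copied from the table above.
SENSITIVITY_LEVELS = ("High", "Medium", "Low", "None", "Unassigned")
DATA_CATEGORIES = ("PII", "PHI", "PCI", "HCI", "General", "Unassigned")
STATUSES = ("Pending review", "Approved", "Corrected")


@dataclass(frozen=True)
class Classification:
    """One row of the classification list (hypothetical attribute names)."""
    entity_name: str
    sensitivity_level: str
    data_category: str
    confidence: float  # percentage, 0 to 100
    reason: str
    status: str = "Pending review"

    def __post_init__(self) -> None:
        # Reject values that the table does not list.
        if self.sensitivity_level not in SENSITIVITY_LEVELS:
            raise ValueError(f"unknown sensitivity level: {self.sensitivity_level!r}")
        if self.data_category not in DATA_CATEGORIES:
            raise ValueError(f"unknown data category: {self.data_category!r}")
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")
        if not 0 <= self.confidence <= 100:
            raise ValueError("confidence must be between 0 and 100")
```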
Sorting and Filtering
Sorting Options
Use the Sort dropdown to organize classifications:
- Date (Newest first): Most recent classifications
- Date (Oldest first): Oldest classifications
- Sensitivity (High → Low): High sensitivity items first
- Sensitivity (Low → High): Low sensitivity items first
- Confidence (High → Low): Most confident classifications first
- Confidence (Low → High): Least confident classifications first
A low confidence score typically indicates one of the following:
- Ambiguous column names (e.g., "AV", "FFF", "AF")
- Free-form text fields that may contain mixed content
- Custom business fields without clear patterns
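The sort orders above can be reproduced on exported data with ordinary key functions. This is a minimal sketch under stated assumptions: the row dictionaries, their keys, and the sample values are invented for illustration, and the numeric ranking of levels is one reasonable choice rather than the platform's documented ordering.

```python
# Assumed ranking; levels not listed here (for example "Unassigned") rank lowest.
SENSITIVITY_RANK = {"High": 3, "Medium": 2, "Low": 1, "None": 0}

# Illustrative rows, not real scan output.
ROWS = [
    {"entity": "customers.ssn", "sensitivity": "High", "confidence": 98},
    {"entity": "orders.AV", "sensitivity": "Low", "confidence": 41},
    {"entity": "customers.notes", "sensitivity": "Medium", "confidence": 55},
    {"entity": "products.sku", "sensitivity": "None", "confidence": 90},
]


def sort_by_sensitivity(rows, high_first=True):
    """Order rows by sensitivity level using the assumed ranking."""
    return sorted(
        rows,
        key=lambda r: SENSITIVITY_RANK.get(r["sensitivity"], -1),
        reverse=high_first,
    )


def sort_by_confidence(rows, high_first=True):
    """Order rows by confidence score."""
    return sorted(rows, key=lambda r: r["confidence"], reverse=high_first)


# Least confident first, which surfaces ambiguous columns such as "AV".
review_queue = [r["entity"] for r in sort_by_confidence(ROWS, high_first=False)]
```

Sorting by confidence from low to high is a convenient way to build a review queue, since the ambiguous fields described above tend to land at the top.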
Filter Options
Available filters in the Classification Activities view:
- Sensitivity Level dropdown: All Levels, High, Medium, Low, None
- Time Range dropdown: All Time, Today, Last 7 Days, Last 30 Days
- Table Filter dropdown: All Tables, or specific table selection
- Search box: Find specific activities by text
- Clear button: Reset all filters when active
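The same filter combination can be applied to exported rows. The sketch below assumes a hypothetical row shape (`table`, `column`, `sensitivity`, `reason`, `classified_at`) and approximates the Time Range dropdown with a `max_age_days` parameter; it is not the platform's implementation, and passing no arguments corresponds to the cleared state.

```python
from datetime import datetime, timedelta
from typing import Optional


def filter_rows(rows, *, level: Optional[str] = None, table: Optional[str] = None,
                max_age_days: Optional[int] = None, text: Optional[str] = None,
                now: Optional[datetime] = None):
    """Return rows matching every filter that is set; None means 'All'."""
    now = now or datetime.now()
    needle = text.lower() if text else None
    result = []
    for row in rows:
        if level is not None and row["sensitivity"] != level:
            continue
        if table is not None and row["table"] != table:
            continue
        if max_age_days is not None and now - row["classified_at"] > timedelta(days=max_age_days):
            continue
        # Case-insensitive search over the entity name and the reason text.
        haystack = f'{row["table"]}.{row["column"]} {row["reason"]}'.lower()
        if needle and needle not in haystack:
            continue
        result.append(row)
    return result


# Illustrative data with a fixed "now" so the example is reproducible.
NOW = datetime(2024, 6, 30, 12, 0)
ROWS = [
    {"table": "customers", "column": "ssn", "sensitivity": "High",
     "reason": "Matches SSN pattern", "classified_at": datetime(2024, 6, 29, 9, 0)},
    {"table": "customers", "column": "notes", "sensitivity": "Medium",
     "reason": "Free-form text", "classified_at": datetime(2024, 6, 1, 9, 0)},
    {"table": "orders", "column": "card_number", "sensitivity": "High",
     "reason": "Matches card number pattern", "classified_at": datetime(2024, 5, 1, 9, 0)},
]

recent_high = filter_rows(ROWS, level="High", max_age_days=7, now=NOW)
```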
Review Mode
Enable Review Mode to systematically validate and correct classifications:
Entering Review Mode
- Toggle the Review Mode switch in the Classification Activities header
- The interface switches to a table-grouped review layout
- Each table can be expanded to show its column classifications
Review Actions
For each classification, you can:
Approve Classification
- Click the Approve button (thumbs up icon)
- Confirms the AI classification is correct
- Moves to the next item automatically
Reject and Correct
- Click the Disapprove button (thumbs down icon)
- A popover opens with:
  - Sensitivity Level dropdown (High, Medium, Low, None)
  - Data Category dropdown (PII, PHI, PCI, HCI, General)
  - Explanation text area (optional)
- Select the correct classification
- Click Save Correction
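The two review actions amount to a small state change on each classification. The following sketch models that change under assumptions: `ReviewItem`, `approve`, and `correct` are hypothetical names, and the status strings are taken from the Status field described earlier in this section.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Choices offered in the correction popover.
LEVELS = ("High", "Medium", "Low", "None")
CATEGORIES = ("PII", "PHI", "PCI", "HCI", "General")


@dataclass(frozen=True)
class ReviewItem:
    """A classification awaiting review (hypothetical shape)."""
    entity_name: str
    sensitivity_level: str
    data_category: str
    status: str = "Pending review"
    explanation: Optional[str] = None


def approve(item: ReviewItem) -> ReviewItem:
    """Confirm the AI classification without changing it."""
    return replace(item, status="Approved")


def correct(item: ReviewItem, level: str, category: str,
            explanation: Optional[str] = None) -> ReviewItem:
    """Replace the classification; the explanation is optional."""
    if level not in LEVELS:
        raise ValueError(f"unknown sensitivity level: {level!r}")
    if category not in CATEGORIES:
        raise ValueError(f"unknown data category: {category!r}")
    return replace(item, sensitivity_level=level, data_category=category,
                   status="Corrected", explanation=explanation)


item = ReviewItem("orders.AV", "Low", "General")
fixed = correct(item, "High", "PCI", "AV holds the address verification result")
```

Returning a new record instead of mutating the original keeps the AI's initial classification available for comparison, which can be useful when documenting decisions for an audit trail.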
Review Mode Features
The following features are available in Review Mode:
- Bulk selection: Select multiple items for batch approval
- Table grouping: Classifications grouped by table
- Expandable tables: Click chevron to expand/collapse table columns
- Pagination: Navigate through large result sets
- Filter controls: Filter by sensitivity level, category, and review status
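Table grouping, bulk approval, and pagination can be illustrated on a plain list of rows. This is a sketch only: the row dictionaries and the `group_by_table`, `bulk_approve`, and `paginate` helpers are hypothetical and do not describe how the platform implements Review Mode.

```python
from collections import defaultdict


def group_by_table(items):
    """Group column classifications under their table name."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item["table"]].append(item)
    return dict(grouped)


def bulk_approve(items, selected):
    """Return copies of the rows, marking the selected (table, column) pairs as Approved."""
    return [
        {**item, "status": "Approved"}
        if (item["table"], item["column"]) in selected else item
        for item in items
    ]


def paginate(items, page, page_size):
    """Return one page of results; pages are numbered from 1."""
    start = (page - 1) * page_size
    return items[start:start + page_size]


# Illustrative rows, not real scan output.
ITEMS = [
    {"table": "customers", "column": "email", "status": "Pending review"},
    {"table": "customers", "column": "ssn", "status": "Pending review"},
    {"table": "orders", "column": "card_number", "status": "Pending review"},
]

reviewed = bulk_approve(ITEMS, {("customers", "email"), ("customers", "ssn")})
by_table = group_by_table(reviewed)
```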
Machine Learning Feedback
Corrections are fed back into the classification algorithm:
- The system learns from your corrections
- Future classifications improve based on feedback
- Pattern recognition adapts to your data conventions
Best Practices
Regular Reviews
- Review classifications after initial scanning
- Focus on high-value or high-risk data first
- Periodically review low-confidence classifications
Handling Ambiguous Fields
- Document business-specific field meanings
- Provide clear corrections with explanations
- Consider renaming ambiguous columns for clarity
Compliance Considerations
- Ensure all PCI data is properly identified
- Verify PHI classifications for healthcare compliance
- Document classification decisions for audit trails
Integration with Other Features
Data Catalog
- Classifications appear inline while browsing
- Edit column metadata to update classifications
- Export classification reports
Access Control
- Use classifications to inform access policies
- Restrict access to highly sensitive data
- Monitor access to classified data
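One way to let classifications inform access policies is to map each sensitivity level to a minimum clearance. The sketch below is hypothetical throughout: the role names, the clearance numbers, and the `can_read` helper are assumptions for illustration and do not correspond to a documented platform feature.

```python
# Hypothetical minimum clearance required to read data at each sensitivity level.
REQUIRED_CLEARANCE = {"High": 3, "Medium": 2, "Low": 1, "None": 0}

# Hypothetical roles and the clearance each one holds.
ROLE_CLEARANCE = {"admin": 3, "analyst": 2, "viewer": 1}


def can_read(role: str, sensitivity: str) -> bool:
    """Decide whether a role may read data at the given sensitivity level.

    Unknown roles get clearance 0. Levels missing from the mapping, such as
    "Unassigned", are treated like "High" so that unreviewed data fails closed.
    """
    held = ROLE_CLEARANCE.get(role, 0)
    required = REQUIRED_CLEARANCE.get(sensitivity, 3)
    return held >= required
```

Treating unassigned data as highly sensitive is a conservative design choice: a column stays restricted until someone has reviewed its classification.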
Data Quality
- Correlate sensitivity with data quality rules
- Apply stricter validation to sensitive fields
- Monitor changes to sensitive data
Troubleshooting
Common Issues
Issue | Cause | Solution |
---|---|---|
Missing classifications | Agent not associated with data source | Add Data Sensitivity Agent to data source |
Low confidence scores | Ambiguous column names | Review and correct; consider renaming columns |
Incorrect categories | Domain-specific data patterns | Provide corrections to train the model |
Classifications not updating | Cache delay | Wait for next scan or trigger manual refresh |
API Access
Classifications can be accessed programmatically:
- Query classifications by table or column
- Submit corrections via API
- Export classification reports
- Integrate with data governance tools
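This section does not document the API's endpoints or payloads, so the sketch below covers only the last step of a typical integration: turning classification records that have already been retrieved into a CSV report for a governance tool. The field names and the `export_report` helper are assumptions based on the classification fields described earlier, not the API's actual schema.

```python
import csv
import io

# Column order for the report; names are assumed, not taken from the API.
FIELDS = ["entity_name", "sensitivity_level", "data_category",
          "confidence", "reason", "status"]


def export_report(records) -> str:
    """Serialize classification records to CSV text, ignoring unknown keys."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Illustrative records, not real API output.
RECORDS = [
    {"entity_name": "customers.ssn", "sensitivity_level": "High",
     "data_category": "PII", "confidence": 98,
     "reason": "Matches SSN pattern", "status": "Approved"},
    {"entity_name": "orders.card_number", "sensitivity_level": "High",
     "data_category": "PCI", "confidence": 95,
     "reason": "Matches card number pattern", "status": "Pending review",
     "internal_id": 42},  # extra key, dropped by extrasaction="ignore"
]

report = export_report(RECORDS)
```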