Project Overview

LastPass manages large volumes of sensitive credential, billing, and user-activity data across numerous enterprise systems. As the platform scaled, its Databricks environment began to suffer from pipeline instability, fragmented datasets, and governance gaps around sensitive information. Closeloop's data engineering team refactored the architecture around a Medallion data model (Bronze, Silver, Gold) and introduced dbt for transformation, documentation, and lineage tracking. The modernized platform improved ingestion reliability, secured the handling of PII, standardized pipelines, and enabled scalable analytics across integrated systems such as Salesforce, Marketo, and AWS S3.

Business Challenges

Fragmented datasets and inconsistent catalog structures across multiple enterprise systems.
High ingestion latency and frequent pipeline failures during peak data loads.
Lack of segregation and governance controls for sensitive PII data.
Limited documentation and unclear ownership of existing data pipelines.
Performance bottlenecks caused by inefficient transformations and schema inconsistencies.

Solution

Implemented a scalable Medallion architecture (Bronze, Silver, Gold) within Databricks.
Introduced dbt for structured transformations, schema documentation, and lineage tracking.
Segregated PII and non-PII datasets with role-based access controls.
Rebuilt ingestion pipelines using PySpark with validation checkpoints.
Standardized catalogs, schemas, and naming conventions across datasets.
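
To make the PII segregation step concrete, the following is a minimal sketch in plain Python rather than the project's actual PySpark code. It splits one record into a PII part and a non-PII part that share a join key, so each part can be stored under different access controls. The column names, the PII_COLUMNS set, and the split_pii helper are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from typing import Any

# Hypothetical set of column names treated as PII. A real deployment would
# more likely drive this from a governed catalog or tagging system.
PII_COLUMNS = frozenset({"email", "full_name", "phone"})


def split_pii(
    record: dict[str, Any], key: str = "user_id"
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Split one record into (pii, non_pii) dicts that both carry the join key."""
    pii: dict[str, Any] = {key: record[key]}
    non_pii: dict[str, Any] = {key: record[key]}
    for column, value in record.items():
        if column == key:
            continue  # the key is already present in both halves
        target = pii if column in PII_COLUMNS else non_pii
        target[column] = value
    return pii, non_pii


record = {"user_id": 7, "email": "a@example.com", "plan": "teams"}
pii_part, non_pii_part = split_pii(record)
```

Keeping the join key in both halves is the design choice that matters here: analysts with access only to the non-PII side can still aggregate by user, while a join back to the PII side requires the more restrictive role.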

Approach

Conducted gap analysis to identify ingestion and transformation bottlenecks.
Reorganized legacy datasets into Medallion layers aligned with best practices.
Rebuilt ETL pipelines using PySpark and SQL on Databricks clusters.
Implemented SCD Type 2 logic through dbt snapshots for historical data tracking.
Added monitoring, validation checkpoints, and automated alerts for pipeline reliability.
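
The historical tracking step can be illustrated with a small sketch of Type 2 slowly changing dimension logic. This is plain Python written for illustration, not the dbt snapshot configuration used in the project. When a tracked column changes, the current row is closed by setting its end timestamp and a new current row is appended. The apply_scd2 function and the valid_from / valid_to field names are assumptions made here; dbt snapshots record the same idea in columns named dbt_valid_from and dbt_valid_to.

```python
from __future__ import annotations

from datetime import datetime
from typing import Any

Row = dict[str, Any]


def apply_scd2(
    history: list[Row],
    incoming: list[Row],
    key: str,
    tracked: list[str],
    now: datetime,
) -> list[Row]:
    """Return a new history with Type 2 changes applied; inputs are not mutated."""
    result = [dict(row) for row in history]
    # Index the open (current) version of each key; valid_to is None when open.
    current = {row[key]: row for row in result if row["valid_to"] is None}
    for new in incoming:
        old = current.get(new[key])
        if old is not None and all(old[c] == new[c] for c in tracked):
            continue  # nothing changed, keep the existing current row
        if old is not None:
            old["valid_to"] = now  # close the superseded version
        fresh: Row = {key: new[key]}
        fresh.update({c: new[c] for c in tracked})
        fresh["valid_from"] = now
        fresh["valid_to"] = None
        result.append(fresh)
        current[new[key]] = fresh
    return result


t1, t2 = datetime(2024, 1, 1), datetime(2024, 2, 1)
history = apply_scd2([], [{"id": 1, "plan": "free"}], "id", ["plan"], t1)
history = apply_scd2(history, [{"id": 1, "plan": "teams"}], "id", ["plan"], t2)
```

After the second call, the history holds two rows for id 1: the "free" version closed at t2 and the "teams" version still open, which is what lets a query reconstruct the state as of any past date.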

Result & Benefits

50% Faster Data Ingestion

Optimized pipelines reduced ingestion time through parallel processing and improved job orchestration.

47% Improved Query Performance

Refactored transformations and structured data layers accelerated analytics queries and reporting.

99.8% Pipeline Success Rate

Validation checkpoints and monitoring improved pipeline reliability and reduced manual intervention.

Databricks Data Platform Modernization – LastPass
Data Engineering & Analytics
