
Announcing Teradata on dbt Cloud

Together, Teradata and dbt Cloud provide best-in-class data management with cutting-edge data transformation.

October 8, 2024 | 4 min read

Data transformation tool dbt (data build tool) is the cornerstone of the modern data stack. Building on the widely adopted dbt-teradata adapter, we’re excited to share that the Teradata platform, including Teradata VantageCloud, the complete cloud analytics and data platform for AI, is now integrated with dbt Cloud. 

dbt Cloud is a natural next step in the ease of use of dbt within VantageCloud environments. This integration facilitates the creation and maintenance of data pipelines, improving data management and empowering teams to transform data at scale.  

dbt Cloud enables data and analytics engineers to be more productive when developing, implementing, and maintaining analytics workflows by offering: 

  • Fully managed infrastructure. This removes the need for local setup and maintenance, allowing teams to focus on analytics tasks rather than time-consuming operational tasks. 
  • Web-based user interface. Users can transform data without writing code beyond SQL. dbt Cloud offers pre-built macros and templates that automate repetitive tasks like date formatting and incremental loads—enabling business users to work more efficiently, with minimal knowledge of SQL. 
  • Scheduler and CI. Users can set schedules for their dbt jobs and monitor performance with built-in tools for orchestration, making it easy to automate and manage data transformation at scale. 
  • Git-enabled version control. dbt Cloud integrates seamlessly with Git for version control and collaborative workflows. Multiple users can work on the same models simultaneously, track changes, and run CI/CD pipelines, ensuring quality and reliability. 
  • Security, project governance, and alerts. Advanced features such as role-based access control (RBAC), logging, audit trails, and alerting make dbt Cloud suitable for large organizations with strict governance needs. 
  • Data governance and collaboration tools. dbt Cloud offers dbt Mesh, Semantic Layer, and Cloud API for enabling cross-project lineage and data sharing, defining metrics and dimensions, and facilitating integration with other data tools and processes. 
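
As one illustration of the scheduler and Cloud API items above, the sketch below prepares an HTTP request that would trigger an existing dbt Cloud job from an external tool. It assumes the dbt Cloud API v2 "trigger job run" endpoint with token authentication. The host, account ID, job ID, and token are placeholders rather than values from this article, and the host in particular depends on your region and account. The request is built but not sent.

```python
import json
import urllib.request


def build_trigger_request(host, account_id, job_id, token, cause):
    """Build (but do not send) a POST request that triggers a dbt Cloud job.

    Assumes the dbt Cloud API v2 endpoint
    /api/v2/accounts/{account_id}/jobs/{job_id}/run/ and token authentication.
    """
    url = f"https://{host}/api/v2/accounts/{account_id}/jobs/{job_id}/run/"
    # The API expects a JSON body with a human-readable reason for the run.
    body = json.dumps({"cause": cause}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only; substitute your own.
request = build_trigger_request(
    host="cloud.getdbt.com",
    account_id=12345,
    job_id=67890,
    token="YOUR_API_TOKEN",
    cause="Triggered from an external orchestrator",
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the call easy to inspect or unit test before it is pointed at a real account.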

Teradata + dbt Cloud: The ultimate data management and transformation platform 

Teradata manages complex, large-scale datasets with enterprise-grade reliability and robust security and governance standards, while dbt provides the transformation tools that turn raw data into actionable insights. Combined, they support well-governed data management and agile analytics, empowering teams to transform data at scale. 

dbt Cloud’s web-based UI, CI/CD automation, scheduling, and collaborative environment simplify the transformation process, allowing teams to deliver faster and more consistent analytics with reduced manual effort. 

Together, Teradata and dbt Cloud provide best-in-class data management with cutting-edge data transformation, helping organizations accelerate their analytics and AI/ML initiatives. 

Hands-on sample dbt project demonstrating dbt Cloud with Teradata 

We’ve prepared a sample project for experimenting with this integration to demonstrate how to take advantage of dbt Cloud for data transformations within VantageCloud. 

The sample project showcases converting raw data from a database (source) into a dimensional model and preparing customer and purchase data for analytics. The data pipeline is composed of multiple stages, beginning with the ingestion of raw data through dbt seed. 

The data is ingested to staging tables and then transformed through dbt models to: 

  • Summarize customer data, such as age and nationality, facilitating demographic analysis and targeted marketing efforts 
  • Produce aggregations on product purchase frequency, delivering insights into product demand to inform inventory and marketing strategies 
  • Analyze customer purchase behavior over time, revealing buying trends and optimizing the customer journey 

The data flow of the project is reflected in its entity-relationship diagram (ERD). The specific SQL transformations are out of scope for this article, but they can be examined in the accompanying dbt project repository.
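
For a rough feel of the seed, staging, and model stages described above, here is a minimal sketch that mimics a purchase-frequency aggregation. It is not taken from the sample project: SQLite stands in for the Teradata database, and the table names, column names, and rows are all hypothetical. In the actual project, the equivalent logic would live in dbt SQL models.

```python
import sqlite3

# Hypothetical raw rows standing in for data loaded with `dbt seed`.
raw_purchases = [
    (1, " Laptop"),
    (2, "laptop"),
    (1, "Phone "),
    (3, "LAPTOP"),
]

conn = sqlite3.connect(":memory:")  # SQLite is only a stand-in for Teradata
conn.execute("CREATE TABLE raw_purchases (customer_id INTEGER, product TEXT)")
conn.executemany("INSERT INTO raw_purchases VALUES (?, ?)", raw_purchases)

# Staging step: light cleanup of the raw data (trim whitespace, normalize case).
conn.execute(
    """
    CREATE VIEW stg_purchases AS
    SELECT customer_id, LOWER(TRIM(product)) AS product
    FROM raw_purchases
    """
)

# Model step: aggregate purchase frequency per product.
product_frequency = conn.execute(
    """
    SELECT product, COUNT(*) AS purchase_count
    FROM stg_purchases
    GROUP BY product
    ORDER BY purchase_count DESC, product
    """
).fetchall()

print(product_frequency)  # [('laptop', 3), ('phone', 1)]
```

The staging view and the aggregate query correspond loosely to a staging model and a downstream model in dbt, where each would be a separate SQL file and the dependency between them would be declared with `ref()`.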

Prerequisites 

To follow along with the example, you’ll need: 

Quick-start guide 

Our quick-start guide contains detailed instructions for: 

  • Creating a new dbt Cloud project 
  • Connecting to a VantageCloud environment from dbt Cloud 
  • Configuring a dbt Cloud environment  
  • Setting up a GitHub repository in dbt Cloud 
  • Developing and running your dbt project in dbt Cloud’s integrated development environment (IDE) 

After setting up your project, connection, and environment according to the guide, you can experience the IDE. 

Running the dbt project from dbt Cloud IDE 

1. From the dbt Cloud IDE, access your dashboard (which shows all projects), development page (where you can access all project-specific models), and deployments (where you can find all jobs, environments, and data sources). You can also access other features like support, help/guides, and account settings. 

2. From the Develop tab, access the “file explorer” section to browse through the project. Build, test, and run projects using the Cloud IDE command bar. 

3. Once you create an environment and run the project in it, the job starts. You can view the stages of the job along with their run times. Expanding these summaries provides access to console and debug logs, which can be downloaded. 

4. Later, you can use the tables transformed using dbt for analytics or AI/ML use cases. 

Derive actionable insights from your data 

The integration of VantageCloud with dbt Cloud provides data and analytics engineers with a robust set of tools to streamline workflows and derive actionable insights from data more efficiently. 

The introduction of Cloud UI, job scheduling, alerting, browser-based development, CI/CD integration, and environment management ensures that teams can handle data transformations and deployments with greater ease and reliability. These advancements are crucial as organizations continue to become more data driven, necessitating efficient and automated data workflows. 

Join us in leveraging these powerful tools to enhance your data analytics and AI initiatives. 


About Mohan Talla

Mohan Talla is a senior software engineer at Teradata. Mohan developed Teradata Connector for Hadoop (TDCH) and dbt-teradata, and he played a key role in developing the Teradata adapter for dbt.


About Varun Sharma

Varun Sharma is a software engineer at Teradata. Sharma developed the Teradata adapter for dbt and has also worked on projects like VCM in Teradata.
