Query Optimization in Multi-Cluster Architecture
Teradata’s query optimizer designs efficient query plans while maintaining techniques gained in the enterprise world.
The Teradata query optimizer has had a rough upbringing: Through decades of balancing ever-growing demands, adjusting to the needs of new database features, stretching to support innovative performance techniques, and reaching out to new data sources, the optimizer has met challenge after challenge—and come out stronger.
When VantageCloud Lake joined the Teradata family, the optimizer confronted yet another intriguing challenge: How to design efficient query plans that span more than one cluster while maintaining accumulated techniques gained from coming of age in the enterprise world.
Query optimizer: Deep dive into multiple-cluster architecture
Before a query begins to execute on any Teradata system, the optimizer formulates a plan. It examines many different approaches for executing the query and picks the plan that is likely to require the fewest resources. To reach that goal, the optimizer distills the request into chunks of work called “query steps,” then details and sequences those steps in an optimal way.
Each step is sent to the parallel units of the current cluster (called AMPs) independently, with each AMP working on its share of the query step in parallel with other AMPs. When each step is complete, the next step is sent to the AMPs.
Data needed by the query may be in different files or tables, so a query of even moderate complexity may involve thousands of tough choices. The optimizer faces a dizzying number of join sequences, join types, and join geographies as the number of tables in the query grows.
# of Tables |
Possible Join Orders |
Grains of Sand |
3 |
12 |
. . . . . . . . . . . . |
4 |
120 |
.................................................. |
10 |
1.8 * 1010 |
6 sandboxes |
16 |
2 * 1020 |
Sand in all the beaches and |
Multi-cluster architecture in VantageCloud Lake
Teradata Vantage™ started life building query plans for fixed-size, massively parallel enterprise systems, designed to run on a single cluster. On this enterprise platform, only the AMPs within that single cluster receive query steps and execute them. When planning on an enterprise platform, the optimizer always knows where the query steps are going to run.
But VantageCloud Lake has a different architecture. It incorporates enterprise and compute clusters. A single query can be executed completely on an enterprise cluster, or it can start its execution on an enterprise cluster, then execute some or most of the remaining query steps on a compute cluster.
With more than one type of cluster available with VantageCloud Lake, what determines which steps run on which cluster?
Global Planner: Managing multi-cluster workloads
To support the multi-cluster architecture of VantageCloud Lake, Teradata introduced a new component called the Global Planner into the optimizer. This extended version of the optimizer lives in an enterprise cluster, where all query parsing and query planning occurs. The Global Planner determines which steps in a query will run on the enterprise and compute clusters.The Global Planner always starts and completes all queries on the primary cluster. If a compute cluster is an option, the Global Planner ensures that query steps eligible to be executed on the compute cluster are bundled into a single set of directives. In its initial version, the Global Planner only builds plans that make a single round trip from primary to compute cluster and back.
Enterprise cluster vs. compute cluster
Before the detailed query plan is built, the Global Planner determines which steps will run on which cluster, in part by examining the type of storage a particular query step is accessing. VantageCloud Lake supports four tiers of storage:
- Block file system on the primary cluster: Data is positioned across AMPs within the cluster
- Object file system: A proprietary object storage file system private to the VantageCloud Lake tenant
- Object file format: Shared object storage accessed using Native Object Store (NOS) foreign tables
- Object table format: Shared open table formats
Steps that access block file system storage will always be assigned to the enterprise cluster. However, steps that access one of the three tiers of object storage will be assigned to run on a compute cluster. This assumes that compute clusters are available and the user who submitted the query has the appropriate privileges. A single VantageCloud Lake query may access data from multiple storage tiers in a single query, making the optimizer’s task much more interesting.
For query steps that don’t access data—like steps that perform functions, aggregations, or joins—the optimizer’s established cost-based assessments are used to assign a step to the cluster that will run it at the lowest cost. The cheaper option wins. To assess comparative costs for a step, the Global Planner looks at available demographics, such as:
- The number of AMPs on different clusters
- The processing power of the nodes
- The number of expected input and output rows of each step
- Other demographics derived from detailed statistics collected on tables that make up the query
After step-to-cluster assignments have been made, standard query optimization takes place using all available techniques that have evolved over the past years, including query rewrite, joining of relational and non-relational data, and specialized performance routines. After the query plan is complete, the finalized plan is separated into two distinct sets of directives: One to be run on an enterprise cluster and one to be run on a compute cluster.
When the query runs, the file containing compute cluster steps is sent to the compute cluster along with other relevant data, such as temporary answer sets. An internal communication layer (query fabric) is in place for inter-cluster communications.
Adaptive optimization
The Teradata optimizer is still a bit scrappy, constantly pushing the envelope. Soon, Teradata’s Adaptive Optimizer will begin improving the quality of complex query plans produced in VantageCloud Lake. Using this capability, a group of steps can be sent to one compute cluster to execute a fragment of the query. Results from the first fragment return to the primary cluster, where they act as input for more accurate planning for the next fragment, which may be executed on a different type of cluster.
In addition, specialized compute clusters are evolving in VantageCloud Lake to support focused operations within a query. For instance, the optimizer could choose to run steps with heavy calculations on graphics processing unit (GPU) clusters while sending steps that execute in-database analytics to analytic clusters, all within the same query.
Teradata’s query optimizer is proof that experience builds excellence. Learn more about collecting, exploring, and transforming data with VantageCloud Lake.
알고 있어
테라데이트의 블로그를 구독하여 주간 통찰력을 얻을 수 있습니다