== Physical Plan ==
TakeOrderedAndProject (59)
+- * HashAggregate (58)
   +- Exchange (57)
      +- * HashAggregate (56)
         +- * HashAggregate (55)
            +- * HashAggregate (54)
               +- * Project (53)
                  +- * SortMergeJoin Inner (52)
                     :- * Sort (43)
                     :  +- * Project (42)
                     :     +- * BroadcastHashJoin Inner BuildLeft (41)
                     :        :- BroadcastExchange (10)
                     :        :  +- * Project (9)
                     :        :     +- * BroadcastHashJoin Inner BuildRight (8)
                     :        :        :- * Filter (3)
                     :        :        :  +- * ColumnarToRow (2)
                     :        :        :     +- Scan parquet spark_catalog.default.customer_address (1)
                     :        :        +- BroadcastExchange (7)
                     :        :           +- * Filter (6)
                     :        :              +- * ColumnarToRow (5)
                     :        :                 +- Scan parquet spark_catalog.default.store (4)
                     :        +- * HashAggregate (40)
                     :           +- * HashAggregate (39)
                     :              +- * Project (38)
                     :                 +- * SortMergeJoin Inner (37)
                     :                    :- * Sort (31)
                     :                    :  +- Exchange (30)
                     :                    :     +- * Project (29)
                     :                    :        +- * BroadcastHashJoin Inner BuildRight (28)
                     :                    :           :- * Project (22)
                     :                    :           :  +- * BroadcastHashJoin Inner BuildRight (21)
                     :                    :           :     :- Union (19)
                     :                    :           :     :  :- * Project (14)
                     :                    :           :     :  :  +- * Filter (13)
                     :                    :           :     :  :     +- * ColumnarToRow (12)
                     :                    :           :     :  :        +- Scan parquet spark_catalog.default.catalog_sales (11)
                     :                    :           :     :  +- * Project (18)
                     :                    :           :     :     +- * Filter (17)
                     :                    :           :     :        +- * ColumnarToRow (16)
                     :                    :           :     :           +- Scan parquet spark_catalog.default.web_sales (15)
                     :                    :           :     +- ReusedExchange (20)
                     :                    :           +- BroadcastExchange (27)
                     :                    :              +- * Project (26)
                     :                    :                 +- * Filter (25)
                     :                    :                    +- * ColumnarToRow (24)
                     :                    :                       +- Scan parquet spark_catalog.default.item (23)
                     :                    +- * Sort (36)
                     :                       +- Exchange (35)
                     :                          +- * Filter (34)
                     :                             +- * ColumnarToRow (33)
                     :                                +- Scan parquet spark_catalog.default.customer (32)
                     +- * Sort (51)
                        +- Exchange (50)
                           +- * Project (49)
                              +- * BroadcastHashJoin Inner BuildRight (48)
                                 :- * Filter (46)
                                 :  +- * ColumnarToRow (45)
                                 :     +- Scan parquet spark_catalog.default.store_sales (44)
                                 +- ReusedExchange (47)


(1) Scan parquet spark_catalog.default.customer_address
Output [3]: [ca_address_sk#1, ca_county#2, ca_state#3]
Batched: true
Location [not included in comparison]/{warehouse_dir}/customer_address]
PushedFilters: [IsNotNull(ca_address_sk), IsNotNull(ca_county), IsNotNull(ca_state)]
ReadSchema: struct<ca_address_sk:int,ca_county:string,ca_state:string>

(2) ColumnarToRow [codegen id : 2]
Input [3]: [ca_address_sk#1, ca_county#2, ca_state#3]

(3) Filter [codegen id : 2]
Input [3]: [ca_address_sk#1, ca_county#2, ca_state#3]
Condition : ((isnotnull(ca_address_sk#1) AND isnotnull(ca_county#2)) AND isnotnull(ca_state#3))

(4) Scan parquet spark_catalog.default.store
Output [2]: [s_county#4, s_state#5]
Batched: true
Location [not included in comparison]/{warehouse_dir}/store]
PushedFilters: [IsNotNull(s_county), IsNotNull(s_state)]
ReadSchema: struct<s_county:string,s_state:string>

(5) ColumnarToRow [codegen id : 1]
Input [2]: [s_county#4, s_state#5]

(6) Filter [codegen id : 1]
Input [2]: [s_county#4, s_state#5]
Condition : (isnotnull(s_county#4) AND isnotnull(s_state#5))

(7) BroadcastExchange
Input [2]: [s_county#4, s_state#5]
Arguments: HashedRelationBroadcastMode(List(input[0, string, false], input[1, string, false]),false), [plan_id=1]

(8) BroadcastHashJoin [codegen id : 2]
Left keys [2]: [ca_county#2, ca_state#3]
Right keys [2]: [s_county#4, s_state#5]
Join type: Inner
Join condition: None

(9) Project [codegen id : 2]
Output [1]: [ca_address_sk#1]
Input [5]: [ca_address_sk#1, ca_county#2, ca_state#3, s_county#4, s_state#5]

(10) BroadcastExchange
Input [1]: [ca_address_sk#1]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=2]

(11) Scan parquet spark_catalog.default.catalog_sales
Output [3]: [cs_bill_customer_sk#6, cs_item_sk#7, cs_sold_date_sk#8]
Batched: true
Location: InMemoryFileIndex []
PartitionFilters: [isnotnull(cs_sold_date_sk#8), dynamicpruningexpression(cs_sold_date_sk#8 IN dynamicpruning#9)]
PushedFilters: [IsNotNull(cs_item_sk), IsNotNull(cs_bill_customer_sk)]
ReadSchema: struct<cs_bill_customer_sk:int,cs_item_sk:int>

(12) ColumnarToRow [codegen id : 3]
Input [3]: [cs_bill_customer_sk#6, cs_item_sk#7, cs_sold_date_sk#8]

(13) Filter [codegen id : 3]
Input [3]: [cs_bill_customer_sk#6, cs_item_sk#7, cs_sold_date_sk#8]
Condition : (isnotnull(cs_item_sk#7) AND isnotnull(cs_bill_customer_sk#6))

(14) Project [codegen id : 3]
Output [3]: [cs_sold_date_sk#8 AS sold_date_sk#10, cs_bill_customer_sk#6 AS customer_sk#11, cs_item_sk#7 AS item_sk#12]
Input [3]: [cs_bill_customer_sk#6, cs_item_sk#7, cs_sold_date_sk#8]

(15) Scan parquet spark_catalog.default.web_sales
Output [3]: [ws_item_sk#13, ws_bill_customer_sk#14, ws_sold_date_sk#15]
Batched: true
Location: InMemoryFileIndex []
PartitionFilters: [isnotnull(ws_sold_date_sk#15), dynamicpruningexpression(ws_sold_date_sk#15 IN dynamicpruning#9)]
PushedFilters: [IsNotNull(ws_item_sk), IsNotNull(ws_bill_customer_sk)]
ReadSchema: struct<ws_item_sk:int,ws_bill_customer_sk:int>

(16) ColumnarToRow [codegen id : 4]
Input [3]: [ws_item_sk#13, ws_bill_customer_sk#14, ws_sold_date_sk#15]

(17) Filter [codegen id : 4]
Input [3]: [ws_item_sk#13, ws_bill_customer_sk#14, ws_sold_date_sk#15]
Condition : (isnotnull(ws_item_sk#13) AND isnotnull(ws_bill_customer_sk#14))

(18) Project [codegen id : 4]
Output [3]: [ws_sold_date_sk#15 AS sold_date_sk#16, ws_bill_customer_sk#14 AS customer_sk#17, ws_item_sk#13 AS item_sk#18]
Input [3]: [ws_item_sk#13, ws_bill_customer_sk#14, ws_sold_date_sk#15]

(19) Union

(20) ReusedExchange [Reuses operator id: 64]
Output [1]: [d_date_sk#19]

(21) BroadcastHashJoin [codegen id : 7]
Left keys [1]: [sold_date_sk#10]
Right keys [1]: [d_date_sk#19]
Join type: Inner
Join condition: None

(22) Project [codegen id : 7]
Output [2]: [customer_sk#11, item_sk#12]
Input [4]: [sold_date_sk#10, customer_sk#11, item_sk#12, d_date_sk#19]

(23) Scan parquet spark_catalog.default.item
Output [3]: [i_item_sk#20, i_class#21, i_category#22]
Batched: true
Location [not included in comparison]/{warehouse_dir}/item]
PushedFilters: [IsNotNull(i_category), IsNotNull(i_class), EqualTo(i_category,Women                                             ), EqualTo(i_class,maternity                                         ), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_class:string,i_category:string>

(24) ColumnarToRow [codegen id : 6]
Input [3]: [i_item_sk#20, i_class#21, i_category#22]

(25) Filter [codegen id : 6]
Input [3]: [i_item_sk#20, i_class#21, i_category#22]
Condition : ((((isnotnull(i_category#22) AND isnotnull(i_class#21)) AND (i_category#22 = Women                                             )) AND (i_class#21 = maternity                                         )) AND isnotnull(i_item_sk#20))

(26) Project [codegen id : 6]
Output [1]: [i_item_sk#20]
Input [3]: [i_item_sk#20, i_class#21, i_category#22]

(27) BroadcastExchange
Input [1]: [i_item_sk#20]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=3]

(28) BroadcastHashJoin [codegen id : 7]
Left keys [1]: [item_sk#12]
Right keys [1]: [i_item_sk#20]
Join type: Inner
Join condition: None

(29) Project [codegen id : 7]
Output [1]: [customer_sk#11]
Input [3]: [customer_sk#11, item_sk#12, i_item_sk#20]

(30) Exchange
Input [1]: [customer_sk#11]
Arguments: hashpartitioning(customer_sk#11, 5), ENSURE_REQUIREMENTS, [plan_id=4]

(31) Sort [codegen id : 8]
Input [1]: [customer_sk#11]
Arguments: [customer_sk#11 ASC NULLS FIRST], false, 0

(32) Scan parquet spark_catalog.default.customer
Output [2]: [c_customer_sk#23, c_current_addr_sk#24]
Batched: true
Location [not included in comparison]/{warehouse_dir}/customer]
PushedFilters: [IsNotNull(c_customer_sk), IsNotNull(c_current_addr_sk)]
ReadSchema: struct<c_customer_sk:int,c_current_addr_sk:int>

(33) ColumnarToRow [codegen id : 9]
Input [2]: [c_customer_sk#23, c_current_addr_sk#24]

(34) Filter [codegen id : 9]
Input [2]: [c_customer_sk#23, c_current_addr_sk#24]
Condition : (isnotnull(c_customer_sk#23) AND isnotnull(c_current_addr_sk#24))

(35) Exchange
Input [2]: [c_customer_sk#23, c_current_addr_sk#24]
Arguments: hashpartitioning(c_customer_sk#23, 5), ENSURE_REQUIREMENTS, [plan_id=5]

(36) Sort [codegen id : 10]
Input [2]: [c_customer_sk#23, c_current_addr_sk#24]
Arguments: [c_customer_sk#23 ASC NULLS FIRST], false, 0

(37) SortMergeJoin [codegen id : 11]
Left keys [1]: [customer_sk#11]
Right keys [1]: [c_customer_sk#23]
Join type: Inner
Join condition: None

(38) Project [codegen id : 11]
Output [2]: [c_customer_sk#23, c_current_addr_sk#24]
Input [3]: [customer_sk#11, c_customer_sk#23, c_current_addr_sk#24]

(39) HashAggregate [codegen id : 11]
Input [2]: [c_customer_sk#23, c_current_addr_sk#24]
Keys [2]: [c_customer_sk#23, c_current_addr_sk#24]
Functions: []
Aggregate Attributes: []
Results [2]: [c_customer_sk#23, c_current_addr_sk#24]

(40) HashAggregate [codegen id : 11]
Input [2]: [c_customer_sk#23, c_current_addr_sk#24]
Keys [2]: [c_customer_sk#23, c_current_addr_sk#24]
Functions: []
Aggregate Attributes: []
Results [2]: [c_customer_sk#23, c_current_addr_sk#24]

(41) BroadcastHashJoin [codegen id : 11]
Left keys [1]: [ca_address_sk#1]
Right keys [1]: [c_current_addr_sk#24]
Join type: Inner
Join condition: None

(42) Project [codegen id : 11]
Output [1]: [c_customer_sk#23]
Input [3]: [ca_address_sk#1, c_customer_sk#23, c_current_addr_sk#24]

(43) Sort [codegen id : 11]
Input [1]: [c_customer_sk#23]
Arguments: [c_customer_sk#23 ASC NULLS FIRST], false, 0

(44) Scan parquet spark_catalog.default.store_sales
Output [3]: [ss_customer_sk#25, ss_ext_sales_price#26, ss_sold_date_sk#27]
Batched: true
Location: InMemoryFileIndex []
PartitionFilters: [isnotnull(ss_sold_date_sk#27), dynamicpruningexpression(ss_sold_date_sk#27 IN dynamicpruning#28)]
PushedFilters: [IsNotNull(ss_customer_sk)]
ReadSchema: struct<ss_customer_sk:int,ss_ext_sales_price:decimal(7,2)>

(45) ColumnarToRow [codegen id : 13]
Input [3]: [ss_customer_sk#25, ss_ext_sales_price#26, ss_sold_date_sk#27]

(46) Filter [codegen id : 13]
Input [3]: [ss_customer_sk#25, ss_ext_sales_price#26, ss_sold_date_sk#27]
Condition : isnotnull(ss_customer_sk#25)

(47) ReusedExchange [Reuses operator id: 69]
Output [1]: [d_date_sk#29]

(48) BroadcastHashJoin [codegen id : 13]
Left keys [1]: [ss_sold_date_sk#27]
Right keys [1]: [d_date_sk#29]
Join type: Inner
Join condition: None

(49) Project [codegen id : 13]
Output [2]: [ss_customer_sk#25, ss_ext_sales_price#26]
Input [4]: [ss_customer_sk#25, ss_ext_sales_price#26, ss_sold_date_sk#27, d_date_sk#29]

(50) Exchange
Input [2]: [ss_customer_sk#25, ss_ext_sales_price#26]
Arguments: hashpartitioning(ss_customer_sk#25, 5), ENSURE_REQUIREMENTS, [plan_id=6]

(51) Sort [codegen id : 14]
Input [2]: [ss_customer_sk#25, ss_ext_sales_price#26]
Arguments: [ss_customer_sk#25 ASC NULLS FIRST], false, 0

(52) SortMergeJoin [codegen id : 15]
Left keys [1]: [c_customer_sk#23]
Right keys [1]: [ss_customer_sk#25]
Join type: Inner
Join condition: None

(53) Project [codegen id : 15]
Output [2]: [c_customer_sk#23, ss_ext_sales_price#26]
Input [3]: [c_customer_sk#23, ss_customer_sk#25, ss_ext_sales_price#26]

(54) HashAggregate [codegen id : 15]
Input [2]: [c_customer_sk#23, ss_ext_sales_price#26]
Keys [1]: [c_customer_sk#23]
Functions [1]: [partial_sum(UnscaledValue(ss_ext_sales_price#26))]
Aggregate Attributes [1]: [sum#30]
Results [2]: [c_customer_sk#23, sum#31]

(55) HashAggregate [codegen id : 15]
Input [2]: [c_customer_sk#23, sum#31]
Keys [1]: [c_customer_sk#23]
Functions [1]: [sum(UnscaledValue(ss_ext_sales_price#26))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_ext_sales_price#26))#32]
Results [1]: [cast((MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#26))#32,17,2) / 50) as int) AS segment#33]

(56) HashAggregate [codegen id : 15]
Input [1]: [segment#33]
Keys [1]: [segment#33]
Functions [1]: [partial_count(1)]
Aggregate Attributes [1]: [count#34]
Results [2]: [segment#33, count#35]

(57) Exchange
Input [2]: [segment#33, count#35]
Arguments: hashpartitioning(segment#33, 5), ENSURE_REQUIREMENTS, [plan_id=7]

(58) HashAggregate [codegen id : 16]
Input [2]: [segment#33, count#35]
Keys [1]: [segment#33]
Functions [1]: [count(1)]
Aggregate Attributes [1]: [count(1)#36]
Results [3]: [segment#33, count(1)#36 AS num_customers#37, (segment#33 * 50) AS segment_base#38]

(59) TakeOrderedAndProject
Input [3]: [segment#33, num_customers#37, segment_base#38]
Arguments: 100, [segment#33 ASC NULLS FIRST, num_customers#37 ASC NULLS FIRST], [segment#33, num_customers#37, segment_base#38]

===== Subqueries =====

Subquery:1 Hosting operator id = 11 Hosting Expression = cs_sold_date_sk#8 IN dynamicpruning#9
BroadcastExchange (64)
+- * Project (63)
   +- * Filter (62)
      +- * ColumnarToRow (61)
         +- Scan parquet spark_catalog.default.date_dim (60)


(60) Scan parquet spark_catalog.default.date_dim
Output [3]: [d_date_sk#19, d_year#39, d_moy#40]
Batched: true
Location [not included in comparison]/{warehouse_dir}/date_dim]
PushedFilters: [IsNotNull(d_moy), IsNotNull(d_year), EqualTo(d_moy,12), EqualTo(d_year,1998), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_moy:int>

(61) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#19, d_year#39, d_moy#40]

(62) Filter [codegen id : 1]
Input [3]: [d_date_sk#19, d_year#39, d_moy#40]
Condition : ((((isnotnull(d_moy#40) AND isnotnull(d_year#39)) AND (d_moy#40 = 12)) AND (d_year#39 = 1998)) AND isnotnull(d_date_sk#19))

(63) Project [codegen id : 1]
Output [1]: [d_date_sk#19]
Input [3]: [d_date_sk#19, d_year#39, d_moy#40]

(64) BroadcastExchange
Input [1]: [d_date_sk#19]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=8]

Subquery:2 Hosting operator id = 15 Hosting Expression = ws_sold_date_sk#15 IN dynamicpruning#9

Subquery:3 Hosting operator id = 44 Hosting Expression = ss_sold_date_sk#27 IN dynamicpruning#28
BroadcastExchange (69)
+- * Project (68)
   +- * Filter (67)
      +- * ColumnarToRow (66)
         +- Scan parquet spark_catalog.default.date_dim (65)


(65) Scan parquet spark_catalog.default.date_dim
Output [2]: [d_date_sk#29, d_month_seq#41]
Batched: true
Location [not included in comparison]/{warehouse_dir}/date_dim]
PushedFilters: [IsNotNull(d_month_seq), GreaterThanOrEqual(d_month_seq,ScalarSubquery#42), LessThanOrEqual(d_month_seq,ScalarSubquery#43), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_month_seq:int>

(66) ColumnarToRow [codegen id : 1]
Input [2]: [d_date_sk#29, d_month_seq#41]

(67) Filter [codegen id : 1]
Input [2]: [d_date_sk#29, d_month_seq#41]
Condition : (((isnotnull(d_month_seq#41) AND (d_month_seq#41 >= ReusedSubquery Subquery scalar-subquery#42, [id=#44])) AND (d_month_seq#41 <= ReusedSubquery Subquery scalar-subquery#43, [id=#45])) AND isnotnull(d_date_sk#29))

(68) Project [codegen id : 1]
Output [1]: [d_date_sk#29]
Input [2]: [d_date_sk#29, d_month_seq#41]

(69) BroadcastExchange
Input [1]: [d_date_sk#29]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=9]

Subquery:4 Hosting operator id = 67 Hosting Expression = ReusedSubquery Subquery scalar-subquery#42, [id=#44]

Subquery:5 Hosting operator id = 67 Hosting Expression = ReusedSubquery Subquery scalar-subquery#43, [id=#45]

Subquery:6 Hosting operator id = 65 Hosting Expression = Subquery scalar-subquery#42, [id=#44]
* HashAggregate (76)
+- Exchange (75)
   +- * HashAggregate (74)
      +- * Project (73)
         +- * Filter (72)
            +- * ColumnarToRow (71)
               +- Scan parquet spark_catalog.default.date_dim (70)


(70) Scan parquet spark_catalog.default.date_dim
Output [3]: [d_month_seq#46, d_year#47, d_moy#48]
Batched: true
Location [not included in comparison]/{warehouse_dir}/date_dim]
PushedFilters: [IsNotNull(d_year), IsNotNull(d_moy), EqualTo(d_year,1998), EqualTo(d_moy,12)]
ReadSchema: struct<d_month_seq:int,d_year:int,d_moy:int>

(71) ColumnarToRow [codegen id : 1]
Input [3]: [d_month_seq#46, d_year#47, d_moy#48]

(72) Filter [codegen id : 1]
Input [3]: [d_month_seq#46, d_year#47, d_moy#48]
Condition : (((isnotnull(d_year#47) AND isnotnull(d_moy#48)) AND (d_year#47 = 1998)) AND (d_moy#48 = 12))

(73) Project [codegen id : 1]
Output [1]: [(d_month_seq#46 + 1) AS (d_month_seq + 1)#49]
Input [3]: [d_month_seq#46, d_year#47, d_moy#48]

(74) HashAggregate [codegen id : 1]
Input [1]: [(d_month_seq + 1)#49]
Keys [1]: [(d_month_seq + 1)#49]
Functions: []
Aggregate Attributes: []
Results [1]: [(d_month_seq + 1)#49]

(75) Exchange
Input [1]: [(d_month_seq + 1)#49]
Arguments: hashpartitioning((d_month_seq + 1)#49, 5), ENSURE_REQUIREMENTS, [plan_id=10]

(76) HashAggregate [codegen id : 2]
Input [1]: [(d_month_seq + 1)#49]
Keys [1]: [(d_month_seq + 1)#49]
Functions: []
Aggregate Attributes: []
Results [1]: [(d_month_seq + 1)#49]

Subquery:7 Hosting operator id = 65 Hosting Expression = Subquery scalar-subquery#43, [id=#45]
* HashAggregate (83)
+- Exchange (82)
   +- * HashAggregate (81)
      +- * Project (80)
         +- * Filter (79)
            +- * ColumnarToRow (78)
               +- Scan parquet spark_catalog.default.date_dim (77)


(77) Scan parquet spark_catalog.default.date_dim
Output [3]: [d_month_seq#50, d_year#51, d_moy#52]
Batched: true
Location [not included in comparison]/{warehouse_dir}/date_dim]
PushedFilters: [IsNotNull(d_year), IsNotNull(d_moy), EqualTo(d_year,1998), EqualTo(d_moy,12)]
ReadSchema: struct<d_month_seq:int,d_year:int,d_moy:int>

(78) ColumnarToRow [codegen id : 1]
Input [3]: [d_month_seq#50, d_year#51, d_moy#52]

(79) Filter [codegen id : 1]
Input [3]: [d_month_seq#50, d_year#51, d_moy#52]
Condition : (((isnotnull(d_year#51) AND isnotnull(d_moy#52)) AND (d_year#51 = 1998)) AND (d_moy#52 = 12))

(80) Project [codegen id : 1]
Output [1]: [(d_month_seq#50 + 3) AS (d_month_seq + 3)#53]
Input [3]: [d_month_seq#50, d_year#51, d_moy#52]

(81) HashAggregate [codegen id : 1]
Input [1]: [(d_month_seq + 3)#53]
Keys [1]: [(d_month_seq + 3)#53]
Functions: []
Aggregate Attributes: []
Results [1]: [(d_month_seq + 3)#53]

(82) Exchange
Input [1]: [(d_month_seq + 3)#53]
Arguments: hashpartitioning((d_month_seq + 3)#53, 5), ENSURE_REQUIREMENTS, [plan_id=11]

(83) HashAggregate [codegen id : 2]
Input [1]: [(d_month_seq + 3)#53]
Keys [1]: [(d_month_seq + 3)#53]
Functions: []
Aggregate Attributes: []
Results [1]: [(d_month_seq + 3)#53]


