IoTDBS 2024 Presentation


Sample-Based Cardinality Estimation in Full Outer Join Queries
April 28-30, 2024
Onsite Presentation

Uriy Grigorev, Olga Pluzhnikova, Evgeny Detkov

Bauman Moscow State Technical University, BMSTU
Moscow, Russia
e-mail: [email protected], [email protected], [email protected]

Andrey Ploutenko, Aleksey Burdakov

Amur State University, AmSU
Blagoveschensk, Russia
e-mail: [email protected], [email protected]

Problem Statement
A general database query:

SELECT attributes, aggregates
FROM T1 JOIN T2 ON ... JOIN Tm ON ...
WHERE (condition on T1) AND (condition on T2) ... AND (condition on Tm)
AND IN (select ... from ... where ...)
AND EXISTS (select ... from ... where ...)
AND NOT EXISTS (select ... from ... where ...)
GROUP BY ... ORDER BY ...;

$U = Q_1 \bowtie Q_2 \bowtie \dots \bowtie Q_m$,

where $Q_i$ = (select attr_i from T_i where condition on T_i) are the subqueries that read from the original tables.

Main task (future work): develop an algorithm for selecting a query execution plan and a cost model for queries that join a large number of tables ($m \sim 100$).

Current task (topic of the presentation): develop a method for estimating the cardinality of intermediate tables (IT), that is, of the join $U$ of the subqueries $Q = (Q_1, Q_2, \dots, Q_m)$ and of its subplans $(Q_{i_1}, Q_{i_2}, \dots, Q_{i_k}) \subseteq Q$.
Estimating cardinality (number of records) plays a key role in creating efficient query plans in large RDBs
IMDB dataset, 113 queries (3 to 16 joins) [1].
Table 1. Impact of the accuracy of cardinality estimation of intermediate tables (IT) [1].

For 35.6%, 22%, 47.1%, 25.4%, and 33.2% of queries (one value for each DBMS compared), the execution time grew by more than a factor of 2. For a considerable percentage of queries, the execution time grew by more than two orders of magnitude (>100 times). The problem is still relevant.

[1] Leis V. et al. How good are query optimizers, really? // Proceedings of the VLDB Endowment. 2015. Vol. 9, no. 3. P. 204-215.
Existing Methods for Estimating the Cardinality of an Intermediate Table
1. Histograms and samples
• widely used in DBMSs
• usually based on simplified assumptions and expert-developed heuristics
2. Query-based machine learning (ML)
• attempts to train a model that estimates Card(T,Q) from a query
• some advanced ML methods improve performance by using more complex models, e.g.:
• deep neural networks (DNNs)
• gradient boosted trees
3. Data-driven ML methods
• query-agnostic
• treat each tuple in T as a point drawn from the joint distribution $P_T(A) = P_T(A_1, A_2, \dots, A_k)$. Let $P_T(Q) = P_T(A_1 \in R_1 \wedge A_2 \in R_2 \wedge \dots \wedge A_k \in R_k)$ be the probability that corresponds to the query Q. Then $Card(T,Q) = P_T(Q) \cdot |T|$.

Current open source and commercial DBMSs primarily use two traditional CardEst methods:
• histograms in PostgreSQL and MS SQL Server
• sampling in MySQL and MariaDB
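
The relation Card(T,Q) = P_T(Q) · |T| used by the data-driven methods can be illustrated with a minimal sketch. Here P_T(Q) is approximated by the fraction of rows in a small sample that fall inside the query ranges; the function name, the sample rows, and the table size are hypothetical and serve only as an illustration (real data-driven estimators learn a model of the joint distribution instead).

```python
def estimate_cardinality(sample_rows, ranges, table_size):
    """Estimate Card(T, Q) = P_T(Q) * |T| from a sample of T.

    ranges maps an attribute name to an inclusive (low, high) interval R_i.
    """
    hits = sum(
        all(low <= row[attr] <= high for attr, (low, high) in ranges.items())
        for row in sample_rows
    )
    p_q = hits / len(sample_rows)  # empirical stand-in for P_T(Q)
    return p_q * table_size


# Hypothetical sample of a table T with |T| = 1000 rows.
sample = [
    {"A1": 1, "A2": 10},
    {"A1": 2, "A2": 20},
    {"A1": 3, "A2": 30},
    {"A1": 4, "A2": 40},
]
query = {"A1": (2, 4), "A2": (0, 30)}  # A1 in [2, 4] AND A2 in [0, 30]
estimate = estimate_cardinality(sample, query, table_size=1000)
print(estimate)
```

Two of the four sampled rows satisfy both range conditions, so the sketch scales the fraction 0.5 by the table size.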

Existing Cardinality Estimation Methods: Disadvantages

Method            Disadvantage
all               The cardinality is estimated for each subplan, so the evaluation time is large and proportional to the number of subplans of the original query
histogram-based   Correlations between selection and join attributes are not taken into account
sample-based      Requires indexes on the foreign keys of the joins
all               Only table joins based on attribute equality are considered
ML-based          The tables to be joined must form an acyclic graph
all               Simplified assumptions or "magic" numbers are used when analyzing complex table filtering conditions ('!=', LIKE, etc.)
all               The cardinality estimate degrades as the number of joined tables increases
all               The justification of the methods is given at the level of heuristics

Example: STATS test, query Q57 (see below), subquery {1,5,3,4,2}. EXPLAIN in PostgreSQL estimates the subquery cardinality at 125,416 records. After 7 hours of wall-clock time (an overnight run) the subquery had still not finished executing in PostgreSQL 15. The actual cardinality of the subquery is 1,375,709,726,310 records.
Full Outer Join (FOJ) of Tables ($Q_1, Q_2, \dots, Q_m$)

$$C(Q) = \sum_{i=1}^{F} \prod_{j=1}^{m} 1_{Q_j,i} \qquad (1)$$

$C(Q)$ is the cardinality of the join of the tables $(Q_1, Q_2, \dots, Q_m)$; $F$ is the number of rows in the FOJ; the indicator $1_{Q_j,i}$ is 0 if, in the $i$-th row of the FOJ $= (Q_1 \bowtie Q_2 \dots \bowtie Q_m)$, the attributes of $Q_j$ are equal to the empty symbol (there is no connection with a record from $Q_j$), and 1 otherwise.

An example of a theta join:
SELECT * FROM Q1, Q2, Q3 WHERE Q1.A1 = Q2.A1 AND Q1.A2 >= Q2.A2 AND Q2.A3 != Q3.A3;

[Figure: tables Q1, Q2, Q3 and their FOJ. Legend: rows with $1_{Q_j,i} = 1$; navigation links.]

C(Q) = 6:
(2,12)-(2,10,33)-(23)
(2,12)-(2,10,33)-(23)
(4,10)-(4,9,33)-(23)
(4,10)-(4,9,33)-(23)
(9,33)-(9,13,23)-(33)
(9,33)-(9,13,23)-(33)
Disadvantage: building the FOJ takes a lot of time (here each table $Q_j$ acts as one block).
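
The counting rule of this slide (the join cardinality equals the number of FOJ rows in which no table is empty) can be checked with a minimal sketch. The table rows below are an assumption reconstructed from the six chains listed above; the extra Q1 row (7, 1) is hypothetical and was added only to produce one FOJ row with an indicator equal to 0. Building the FOJ pairwise, first Q1 with Q2 and then with Q3, is also an assumption of this sketch.

```python
def full_outer_join(left, right, pred):
    """Full outer join of two lists of row tuples; None marks an empty side."""
    out, matched_right = [], set()
    for l in left:
        hit = False
        for k, r in enumerate(right):
            if pred(l, r):
                out.append(l + r)
                matched_right.add(k)
                hit = True
        if not hit:  # left row without a partner: pad the right side
            out.append(l + (None,) * len(right[0]))
    for k, r in enumerate(right):
        if k not in matched_right:  # right row without a partner
            out.append((None,) * len(left[0]) + r)
    return out


def cardinality(foj_rows):
    """Count FOJ rows in which every table contributes a record."""
    return sum(all(rec is not None for rec in row) for row in foj_rows)


Q1 = [(2, 12), (4, 10), (9, 33), (7, 1)]      # (A1, A2); (7, 1) is hypothetical
Q2 = [(2, 10, 33), (4, 9, 33), (9, 13, 23)]   # (A1, A2, A3)
Q3 = [(23,), (23,), (33,), (33,)]             # (A3,)


def on_q1_q2(l, r):
    q1, q2 = l[0], r[0]
    return q1 is not None and q1[0] == q2[0] and q1[1] >= q2[1]


def on_q2_q3(l, r):
    q2, q3 = l[1], r[0]
    return q2 is not None and q2[2] != q3[0]


step1 = full_outer_join([(r,) for r in Q1], [(r,) for r in Q2], on_q1_q2)
foj = full_outer_join(step1, [(r,) for r in Q3], on_q2_q3)
F = len(foj)          # number of rows in the FOJ
C = cardinality(foj)  # rows where all indicators are 1
print(F, C)
```

With this data the FOJ has seven rows: the six chains from the slide plus one row for the unmatched record (7, 1), which does not count towards C(Q).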

Full Outer Join (FOJ) of Blocks

$$C(Q) = \sum_{(i_1, \dots, i_m)} C(Q_{1,i_1}, \dots, Q_{m,i_m}) \qquad (2)$$

$Q_{j,i}$ is the $i$-th block of table $Q_j$. The sum is taken over all combinations of blocks.

Example (for the query and tables, see above): Q1.A1 = Q2.A1 AND Q1.A2 >= Q2.A2 AND Q2.A3 != Q3.A3;

[Figure: the highlighted areas of each table $Q_j$ are its blocks $Q_{j,i}$, which are the blocks joined by the FOJ.]

C(Q1,1, Q2,1, Q3,1) = 2, C(Q1,1, Q2,1, Q3,2) = 0,
C(Q1,1, Q2,2, Q3,1) = 0, C(Q1,1, Q2,2, Q3,2) = 0,
C(Q1,2, Q2,1, Q3,1) = 2, C(Q1,2, Q2,1, Q3,2) = 0,
C(Q1,2, Q2,2, Q3,1) = 0, C(Q1,2, Q2,2, Q3,2) = 0,
C(Q1,3, Q2,1, Q3,1) = 0, C(Q1,3, Q2,1, Q3,2) = 0,
C(Q1,3, Q2,2, Q3,1) = 0, C(Q1,3, Q2,2, Q3,2) = 2.
C(Q) = 6
Disadvantage: estimating the join cardinality of one combination of blocks (Q1,i1, ..., Qm,im) is fast (the blocks are small), but the number of combinations (Q1,i1, ..., Qm,im) may be very large.
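
The block-wise sum can be reproduced with a short sketch. The split of the rows into blocks is an assumption reconstructed from the twelve values listed above, and block indices are 0-based here (the slide uses 1-based indices). For brevity the per-combination cardinality is counted directly as the number of record triples that satisfy all join conditions, which equals the number of FOJ rows with no empty table.

```python
from itertools import product

# Assumed block layout: 3 blocks for Q1, 2 for Q2, 2 for Q3.
Q1_BLOCKS = [[(2, 12)], [(4, 10)], [(9, 33)]]        # (A1, A2)
Q2_BLOCKS = [[(2, 10, 33), (4, 9, 33)], [(9, 13, 23)]]  # (A1, A2, A3)
Q3_BLOCKS = [[(23,), (23,)], [(33,), (33,)]]         # (A3,)


def block_cardinality(b1, b2, b3):
    """Join cardinality of one combination of blocks for the theta join."""
    return sum(
        1
        for q1, q2, q3 in product(b1, b2, b3)
        if q1[0] == q2[0] and q1[1] >= q2[1] and q2[2] != q3[0]
    )


per_combination = {
    (i1, i2, i3): block_cardinality(Q1_BLOCKS[i1], Q2_BLOCKS[i2], Q3_BLOCKS[i3])
    for i1, i2, i3 in product(range(3), range(2), range(2))
}
total = sum(per_combination.values())  # sum over all block combinations
print(per_combination)
print(total)
```

Only three of the twelve combinations contribute, which is the sparsity that makes exhaustive enumeration wasteful when the number of combinations grows.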

Proposed Method for Estimating the Cardinality of Tables ($Q_1, Q_2, \dots, Q_m$) (EVACAR)

To reduce the amount of computation, we use the theory of approximate calculation of aggregates.
1. With probability $\pi_g$, select a combination of blocks $g = (i_1, \dots, i_m)$:
$$\pi_g = \prod_{j=1}^{m} \frac{1}{N_j}, \quad N_j \text{ is the number of blocks in table } Q_j. \qquad (3)$$
2. For this combination, build the full outer join: $\mathrm{FOJ}_g = \mathrm{FOJ}(Q_{1,i_1}, \dots, Q_{m,i_m})$.
3. For this $\mathrm{FOJ}_g$, calculate the cardinality $c_g = c(Q_{1,i_1}, \dots, Q_{m,i_m})$ according to formula (1).
4. The sampling of $g$ is repeated $n$ times. The cardinality is then estimated using the formula
$$c(Q, n) = \frac{1}{n} \sum_{g} \frac{c_g}{\pi_g} = \frac{N}{n} \sum_{g} c_g, \quad N = \prod_{j=1}^{m} N_j. \qquad (4)$$
Properties of estimate (4):
1. $c(Q, n) \to c(Q)$ as $n \to \infty$, and $\forall n \, (E\, c(Q, n) = c(Q))$, i.e. estimate (4) is unbiased for any $n$.
2. Property 1 holds for any probability distribution $\{\pi_g\}$ such that $c_g > 0 \Rightarrow \pi_g > 0$.
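
The four steps can be sketched as follows. This is a minimal illustration, not the authors' C implementation: the per-combination cardinalities are looked up in a small dictionary taken from the three-table example on the previous slide (0-based block indices), whereas the real method would compute each one by joining the sampled blocks. The function names are hypothetical.

```python
import math
import random
from itertools import product

BLOCK_COUNTS = (3, 2, 2)  # N_j for Q1, Q2, Q3 in the toy example
NONZERO_C = {(0, 0, 0): 2, (1, 0, 0): 2, (2, 1, 1): 2}  # non-empty block joins


def block_join_cardinality(g):
    """Stand-in for steps 2 and 3: cardinality of the join of the blocks in g."""
    return NONZERO_C.get(g, 0)


def evacar_estimate(n, rng):
    """Step 4: draw n uniform combinations g and scale the sum by N / n."""
    big_n = math.prod(BLOCK_COUNTS)
    total = 0
    for _ in range(n):
        g = tuple(rng.randrange(n_j) for n_j in BLOCK_COUNTS)  # step 1
        total += block_join_cardinality(g)
    return big_n * total / n


# Exact expectation of a single draw, by enumerating all N combinations.
all_combinations = list(product(*(range(n_j) for n_j in BLOCK_COUNTS)))
exact_expectation = sum(
    math.prod(BLOCK_COUNTS) * block_join_cardinality(g) for g in all_combinations
) / len(all_combinations)

rng = random.Random(0)
runs = [evacar_estimate(10, rng) for _ in range(2000)]
mean_of_runs = sum(runs) / len(runs)
print(exact_expectation, mean_of_runs)
```

The enumeration shows the unbiasedness property directly: the expectation equals the true cardinality of 6, while individual runs with n = 10 scatter around it.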

Theoretical Assessment of the Accuracy of Cardinality Calculations

Confidence interval of the estimate $c(Q, n)$ (relative value):

$$\frac{|c(Q) - c(Q, n)|}{c(Q)} = \delta \le t_{n-1,\alpha} \sqrt{\left( \frac{N}{c^2(Q)} \sum_{g} c_g^2 - 1 \right) \frac{1}{n - 1}} \qquad (5)$$

$c(Q)$ is the true value of the cardinality; for $n > 121$ the coefficient $t_{n-1,\alpha}$ practically does not depend on $n$, and for $\alpha$ = 0.9, 0.95, 0.99 it is equal to 1.645, 1.960, 2.576.

To simplify the analysis of formula (5), we assume that the cardinality $c(Q)$ is uniformly distributed over $K$ combinations (chains) $g = (i_1, \dots, i_m)$. That is, $c_g = c(Q)/K$ and $|\{g\}| = K$. Then we get

$$\delta \le t_{n-1,\alpha} \sqrt{\left( \frac{N}{K} - 1 \right) \frac{1}{n - 1}} \qquad (6)$$

Conclusion. The larger $K$, that is, the number of combinations $g$ with non-empty block joins, the smaller the relative error $\delta$.
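
The simplified bound, t · sqrt((N/K − 1) / (n − 1)), is easy to tabulate. The sketch below is a minimal illustration with a hypothetical function name; the coefficient t is passed in explicitly because, as stated above, the constant values such as 1.960 apply only for n > 121.

```python
import math


def relative_error_bound(big_n, k, n, t):
    """Bound on the relative error from the simplified formula.

    big_n: product of the numbers of blocks (N)
    k:     number of block combinations with non-empty joins (K)
    n:     number of sampled combinations
    t:     Student coefficient for n - 1 degrees of freedom
    """
    return t * math.sqrt((big_n / k - 1) / (n - 1))


# Toy example from the earlier slides: N = 12 combinations, K = 3 non-empty.
bound_k3 = relative_error_bound(12, 3, n=200, t=1.960)
bound_k6 = relative_error_bound(12, 6, n=200, t=1.960)
bound_full = relative_error_bound(12, 12, n=200, t=1.960)
print(bound_k3, bound_k6, bound_full)
```

The bound shrinks as K grows and vanishes when every combination contributes equally (K = N), which matches the conclusion of the slide.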
Implementation of the EVACAR Method (Prototype)
Reading records from source tables:
a) at the query execution stage
b) at the plan building stage

[Figure: test bench. A Windows host runs a virtual machine that contains the client (the EVACAR test program, query Q57) and the DB server (PostgreSQL 15 with the STATS test database). Blocks of $Q_j$ are read through a cursor (libpq).]

Virtual machine (VM) with Ubuntu 18.04.5 OS, 1 Intel Core i5 CPU, 4 GB RAM, 20 GB disk. The EVACAR program is implemented in C (gcc compiler); the program size is 40 KB. Test dataset: STATS.

Test Description
Testing on the STATS dataset, specifically used to analyze CardEst methods.
Complex properties:
• large number of attributes
• strong distribution skewness
• high attribute correlation
• complex table join scheme

Query Q57 (the most representative):
• 6 tables are joined and search conditions are applied to them (we will call them subqueries)

SELECT COUNT(*)
FROM users AS u, badges AS b, postHistory AS ph,
     votes AS v, posts AS p, postLinks AS pl
WHERE p.Id = pl.RelatedPostId AND u.Id = p.OwnerUserId
  AND u.Id = b.UserId AND u.Id = ph.UserId AND u.Id = v.UserId
  AND p.CommentCount >= 0 AND p.CommentCount <= 13
  AND ph.PostHistoryTypeId = 5
  AND ph.CreationDate <= '2014-08-13 09:20:10'::timestamp
  AND v.CreationDate >= '2010-07-19 00:00:00'::timestamp
  AND b.Date <= '2014-09-09 10:24:35'::timestamp
  AND u.Views >= 0 AND u.DownVotes >= 0
  AND u.CreationDate >= '2010-08-04 16:59:53'::timestamp
  AND u.CreationDate <= '2014-07-22 15:15:22'::timestamp;

Execution:
• run on a VM in the PostgreSQL 15 environment: about 17 min
• ~7 min of pure virtual machine time
• query result: 17,849,233,970 records

Comparison with BayesCard, DeepDB and FLAT Methods in Terms of Accuracy

EVACAR: the product of the numbers of blocks $N_j$ was $N = 10^5$; the number of samples $g$ was $n = 10$.

Sample size: 1 for BayesCard, DeepDB and FLAT; 50 for EVACAR (for each estimate $c(Q, n)$).

EVACAR is better (yellow) than:
• BayesCard for 13% of subplans
• DeepDB for 38% of subplans
• FLAT for 42% of subplans

EVACAR is worse (green): in 1 case.

Comparison with BayesCard, DeepDB and FLAT Methods for Performance and Memory

BayesCard, DeepDB and FLAT:
Two different Linux servers. One, with 32 Intel(R) Xeon(R) Platinum 8163 processors clocked at 2.50 GHz, a Tesla V100 SXM2 GPU, and 64 GB RAM, was used to train the models. The other, with 64 Intel Xeon E5-2682 processors clocked at 2.50 GHz, was used for cardinality estimation on PostgreSQL.
EVACAR:
One virtual machine (VM) with Ubuntu 18.04.5, 1 Intel Core i5 CPU with a frequency of 1.6 GHz, and 4 GB RAM.

Space-time characteristics of the compared cardinality estimation methods

Characteristic                                                       BayesCard  DeepDB  FLAT  EVACAR
Average cardinality estimation time per subplan request, ms          5.8        87      175   2200/24 = 92; 90/24 = 3.8
Model size, MB                                                       5.9        162     310   13.1 (7.1) for n = 10
Training time, min                                                   1.8        108     262   not required
Model update time when inserting 10^6 records into the database, s   12         248     360   not required

EVACAR: time was measured with the gprof program, memory with the valgrind program.
Advantages of the EVACAR Method
• Mathematical basis: the property of the full outer join and the theory of sum estimation based on sampling
• No need to train and retrain a model, unlike methods based on machine learning
• The condition for joining tables can be arbitrary (theta join), that is, it is not necessarily equality of attributes
• No problems with assessing the selectivity of the source tables, since records are joined after executing the subqueries (slide 10)
• Database indexes are not accessed, so their presence is not required
• No assumptions about the independence of attributes or the uniform distribution of records across domain values (unlike the classical approach)
• The accuracy of cardinality estimation and the running time of the algorithm are controlled by the number of samples $n$ and the product of the numbers of blocks $N_j$
• The blocks $Q_{j,i}$ are small, so their full outer join is performed quickly

Advantages of the EVACAR method
(continued)
The cardinality of each subplan is estimated based on the sample for the original query Q , that is, no
additional costs are required.

One tree of blocks and structures is built for the entire query, and then its subtrees are used to
evaluate the cardinality of any subplan. Therefore, evaluation of subplans is performed quickly.

Advantages of the EVACAR method
(continued)
The join graph of the query tables may contain cycles.
Example:
select * from A, B where A.a1 = B.b1 and A.a2 > (select avg(C.c3) from C where B.b2 = C.c1 and C.c2 = A.a3);
The join conditions (A.a1 = B.b1) - (B.b2 = C.c1) - (C.c2 = A.a3) form a cyclic graph A - B - C - A. In addition, a condition (>) is applied to the A.a2 attribute.

[Figure: tree of blocks for tables A, B and C. Structures with record numbers of child blocks are used to obtain the chains (iA, iB, iC). Additional blocks hold the table attributes a2, a3 (for A) and c2, c31 (for C), where c31 = avg_{c1,c2}(c3). Additional filtering conditions: a3 = c2 and a2 > c31.]

$$c(i) = (1_{A,i} = 1) \wedge (1_{B,i} = 1) \wedge (1_{C,i} = 1) \wedge 1_{a3 = c2 \text{ and } a2 > c31,\, i}$$
Disadvantages of the Method and Directions for Further Research

$$\delta \le t_{n-1,\alpha} \sqrt{\left( \frac{N}{K} - 1 \right) \frac{1}{n - 1}} \quad \text{(see (6))}$$

$N = \prod_{j=1}^{m} N_j$ is the product of the numbers of blocks of the tables $Q_j$,
$K$ is the number of combinations of blocks $g = (i_1, \dots, i_m)$ with non-empty joins,
$n$ is the sample size of combinations $g$.
Disadvantage. The larger $N$ and the smaller $K$, the greater the error in cardinality estimation for a fixed $n$.
Solutions
1. Increasing the sample size $n$ through parallelization.
$$c(Q, n) = \frac{1}{n} \sum_{g} \frac{c_g}{\pi_g} = \frac{N}{n} \sum_{g} c_g \quad \text{(see (4))}$$
The calculation of the sum can be parallelized across several cores. If $n$ is increased from 10 to 100 (i.e. 10 cores are used), then the 95% confidence interval of the q-error for query Q57 narrows from (-9.8, 4.4) (see subplot 24 on slide 12) to (-2.1, 1.6), i.e. it becomes almost 4 times narrower.
2. The accuracy of the estimate $c(Q, n)$ depends significantly on the probability distribution $\{\pi_g\}$. If $\pi_g \approx c_g / c(Q)$, then the error in the cardinality calculation will be minimal.
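
The effect of the sampling distribution can be seen in a minimal sketch that compares uniform sampling with the ideal distribution proportional to the block-join cardinalities. The non-empty combinations reuse the toy three-table example (0-based block indices), and the function name is hypothetical. In practice the true cardinalities are unknown in advance, so the proportional distribution is only a target that a practical scheme could try to approximate.

```python
import random
from fractions import Fraction

# Non-empty block joins of the toy example; all other combinations give 0.
C_G = {(0, 0, 0): 2, (1, 0, 0): 2, (2, 1, 1): 2}
ALL_G = [(i, j, k) for i in range(3) for j in range(2) for k in range(2)]


def estimate(pi, n, rng):
    """Estimate (1/n) * sum(c_g / pi_g) over n combinations drawn from pi."""
    combos = list(pi)
    weights = [float(pi[g]) for g in combos]
    sample = rng.choices(combos, weights=weights, k=n)
    return sum(Fraction(C_G.get(g, 0)) / pi[g] for g in sample) / n


uniform = {g: Fraction(1, len(ALL_G)) for g in ALL_G}
true_cardinality = sum(C_G.values())
proportional = {g: Fraction(c, true_cardinality) for g, c in C_G.items()}

rng = random.Random(1)
uniform_runs = [estimate(uniform, 10, rng) for _ in range(200)]
proportional_runs = [estimate(proportional, 10, rng) for _ in range(200)]
print(sorted(set(uniform_runs)))
print(set(proportional_runs))
```

With the proportional distribution every sampled term equals the true cardinality, so all runs return exactly 6, whereas the uniform runs spread over several values.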
Thank you!

