Infobright ICE User Guide
GA
USER GUIDE
WWW.INFOBRIGHT.COM
Copyright Notice
The materials provided herein are Copyright © 2005-2010 Infobright Inc.
All rights reserved.
CONFIDENTIAL: The information contained in this document is the property of Infobright Inc. Except as
specifically authorized in writing by Infobright, the holder of this document shall keep the information contained
herein confidential and shall protect same in whole or in part from disclosure or dissemination to third parties.
If these materials were purchased as a digital download, Infobright hereby grants the purchaser permission to
reproduce a single copy (print or download) of the materials without prior written permission.
If these materials were purchased in printed form, no part of these materials shall be reproduced or retransmitted
by any means, electronic, mechanical, photocopying, recording, or otherwise without written permission from
Infobright.
Document Revision 3.4.2 GA-10.07.28
CONTENTS
1. About Infobright
    Infobright Overview
    Infobright and MySQL
2. Setting up Infobright
    Technical Requirements
    Linux for Infobright
    Installing Infobright
        Windows Installation Instructions
        Linux Installation Instructions
            RPM and DPKG Install
            TAR Install
        Windows Upgrade Instructions
        Linux Upgrade Instructions
            RPM or DPKG Upgrade
            TAR Upgrade
    Configuring Infobright
        Configuration Tips and Examples
3. Using Infobright
    Starting and Stopping the Infobright Server
        Windows
        Linux and Solaris
        Linux
    Working with the Infobright Server
        Windows
        Linux and Solaris
        Linux
    Checking the Infobright Version
    Infobright is the default Storage Engine
    About Log Files
    About Errors
    About SQL Command Syntax
    About SQL ISO Standards
1. About Infobright
Infobright Overview
Thank you for choosing to install Infobright Community Edition (ICE) 3.4.2 GA. Infobright is
a column-oriented, high performance analytic engine designed for analytic applications and
data marts that need fast query response across large data volumes. Infobright was designed
specifically for large volume data analytics applications with up to 50TB of data.
Infobright uses a unique and patent-pending approach to compressing, storing, and
processing data that allows it to be installed and run on commodity hardware with little or
no DBA intervention. Infobright requires little tuning to support ad hoc or complex business
analytic queries.
Infobright is a database engine utilizing the MySQL database environment. As such,
Infobright is fully compatible with all MySQL-compliant Business Intelligence tools and
utilizes the MySQL administrative interface to reduce the learning curve for system
administrators.
Infobright Community Edition provides a versatile, highly compressed database system
optimized for analytic queries. Compression ratio and the speed of data import and
retrieval are optimized at the expense of some transactional capabilities, such as frequent
data updates.
Infobright executes complex or ad hoc queries across vast amounts of data with a low cost of
ownership.
Since other storage engines, like InnoDB and Falcon, are not included in the Infobright
distribution, they must be run as separate instances (executables). If you wish to combine
other storage engines with Infobright, you will need to look at a database federation
application (some BI tools provide this).
2. Setting up Infobright
Technical Requirements
Before installing Infobright, review the following technical requirements.
Important: 32-bit platforms are for solution testing purposes only, and not recommended for
Installing Infobright
The Infobright installation packages are provided as an RPM, DEB, PKG, .exe, or tarball. For
non-Windows platforms, the user installing Infobright must be the root user or a user with
the necessary permissions to install files, create the user mysql and create the group mysql.
Important: Do not install in the root or home directories due to possible MySQL permission
checking issues during install, start up, and/or load.
If you use the rpm --prefix option, you should manually create a softlink to the
Infobright install directory from /usr/local/infobright.
3. To change the default install options, after installation run:
/usr/local/infobright/postconfig.sh
You can run this script at any time after installation to change the datadir, CacheFolder,
socket, or port. The script must be run as root, and Infobright must not be running.
Datadir     Path to the directory where tables will be created and stored. Use
            high-performance storage such as a RAID array.
Cachedir    Path to the directory where temporary files will be created and
            stored. Should be located on a fast drive, possibly not the same as
            the data. Allow at least 100 GB of free space (depending on
            database size).
4. The installation determines the optimum memory settings based on the physical memory
of the system. You may change these settings by editing the file brighthouse.ini within the
data directory.
Important: The memory settings assume that there are no other services on the machine
consuming significant memory. If this is not the case, please lower the memory
settings for Infobright.
INFOBRIGHT 3.4.2 USER GUIDE
2 SETTING UP INFOBRIGHT 6
rpm -e infobright
or
dpkg -r infobright
TAR Install
To install Infobright on Linux using the tarball package:
1. Obtain root user access
2. Change to the parent location in which you want to install (e.g. /usr/local)
cd /usr/local
Important: Do not install in the root or home directories due to possible MySQL permission
checking issues during install, start up, and/or load.
3. Unpack the tarball, which will create the product directory (e.g. infobright-3.4.2-p1-x86_64_ice)
and create a symbolic link ‘infobright’ to the product folder
4. Run the install script with the “--help” flag to check the system configuration and display
examples of directory parameters
./install-infobright.sh --help
Parameters required:
Datadir     Path to the directory where tables will be created and stored. Use
            high-performance storage such as a RAID array.
Cachedir    Path to the directory where temporary files will be created and
            stored. Should be located on a fast drive, possibly not the same as
            the data. Allow at least 100 GB of free space (depending on
            database size).
User        System user who can run the Infobright server instance. User will
            be created if it does not exist. The default user is mysql.
Group       System group for the above user. Group will be created if it does
            not exist. The default group is mysql.
5. Run the install script again, this time with directory parameters. If you pass parameters
that refer to items that already exist (for example, by running the same script with the same
parameters twice), an error will occur.
Example command
./install-infobright.sh --datadir=/usr/local/infobright/data \
    --cachedir=/usr/local/infobright/cache --port=5029 --config=/etc/my-ib.cnf \
    --socket=/tmp/mysql-ib.sock --user=mysql --group=mysql
6. Change the default memory configuration by editing the file brighthouse.ini within the
data directory. See “Recommended Memory Configurations” later in this chapter.
Important: It is critical that you increase the memory settings for systems running more than
2GB of physical memory or performance will be severely impacted.
Important: The MySQL Upgrade utility may display several errors regarding the use of locks
with log tables and errors requiring table upgrades. The errors are all handled
automatically by Infobright and/or the upgrade utility and can be ignored.
7. Stop and start the Infobright server from the Start Menu items.
Important: The MySQL Upgrade utility may display several errors regarding the use of locks
with log tables and errors requiring table upgrades. The errors are all handled
automatically by Infobright and/or the upgrade utility and can be ignored.
/usr/local/infobright/bin/mysqld --version
TAR Upgrade
To upgrade Infobright on Linux using the tarball package:
1. Unpack the tarball into a temporary folder. Use the gunzip utility for unpacking:
cd /path/to/temp/
gunzip < /path/to/infobright-3.4.2-x86_64.tar.gz | tar xvf -
3. Run the install script with the “--upgrade” and “--config” flags and pass in the
configuration files of the previously installed version:
./install-infobright.sh --upgrade --config=/etc/my-ib.cnf
Important: The MySQL Upgrade utility may display several errors regarding the use of locks
with log tables and errors requiring table upgrades. The errors are all handled
automatically by Infobright and/or the upgrade utility and can be ignored.
Configuring Infobright
The Infobright configuration file is called brighthouse.ini and is located in the data
subdirectory within your Infobright installation directory. The configuration file is a text file
containing the Infobright configuration parameters. See the Infobright installation package
for a sample brighthouse.ini file.
Important: It is critical that you specify increased memory settings for systems running more
than 2GB of physical memory to ensure optimal performance.
Each parameter is shown on a separate line and uses the following form:
ParameterName=ParameterValue
If a parameter is not present in the configuration file or if the configuration file does not exist,
the default values are used. Blank lines and comments (lines starting with #) are ignored.
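As a rough illustration of the file format described above, the following Python sketch reads brighthouse.ini-style text, skips blank lines and # comments, and falls back to defaults for anything that is not set. The function name read_brighthouse_settings and the ASSUMED_DEFAULTS table are assumptions made for this example (the two values are the heap defaults quoted in this guide); the helper is not part of the Infobright distribution, and the server's own parser may treat whitespace differently.

```python
# Hypothetical helper, not shipped with Infobright: reads text in the
# ParameterName=ParameterValue form described above.

# Defaults quoted in this guide for the two heap parameters (in MB).
ASSUMED_DEFAULTS = {"ServerMainHeapSize": "600", "LoaderMainHeapSize": "320"}


def read_brighthouse_settings(text, defaults=ASSUMED_DEFAULTS):
    """Return a dict of settings, starting from the defaults."""
    settings = dict(defaults)
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Blank lines and comment lines (starting with #) are ignored.
        if not line or line.startswith("#"):
            continue
        name, separator, value = line.partition("=")
        if not separator:
            continue  # not a Name=Value line; skipped in this sketch
        # Parameter names are case-sensitive, so the name is kept as typed.
        settings[name.strip()] = value.strip()
    return settings


sample = """
# ServerMainHeapSize=600
ServerMainHeapSize=4000

LoaderMainHeapSize=400
"""
print(read_brighthouse_settings(sample))
```

A commented-out line such as "# ServerMainHeapSize=600" leaves the default in place, which is the behavior the guide describes for the sample file.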
Be sure to customize the following parameters to optimize performance. These parameters
are case-sensitive and must be typed as shown.
ServerMainHeapSize=size
    Value:        Not less than 320. Default: 600.
    Description:  Size of the main memory heap in the server process, in MB. The
                  larger the heap size, the more effectively the server works.
                  However, the sum of the heap sizes in the server and the loader
                  should not exceed the physical memory installed in the machine;
                  otherwise, performance decreases radically.
Note: The values are commented out (preceded by #) in the brighthouse.ini file, which
causes them to default to the application minimum allowed values of 600 and 320 for
ServerMainHeapSize and LoaderMainHeapSize respectively.
In most cases, the loader does not benefit from larger memory settings. However, increasing
the LoaderMainHeapSize can help when:
• a table to be loaded has very long text values, or
• the table has many columns (e.g., 1000 columns).
You can use more memory at import if you are planning to execute several concurrent load
tasks to different data tables. However, disk access may become a bottleneck.
ServerMainHeapSize should be as large as possible but safely smaller than the amount of
physical memory in the machine. If performance decreases because of memory swapping by
the operating system, try to set lower heap sizes. We also recommend decreasing the heap
size if many users are running queries in parallel.
Important: Infobright may use additional memory for heavy loads or queries. Also, other
applications on your server will use memory for their processes. It is important
that the total of ServerMainHeapSize and LoaderMainHeapSize is less than the
total available physical memory. If the system needs to swap memory,
performance will be severely impacted.
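The sizing rule above can be checked with simple arithmetic before editing brighthouse.ini. The sketch below is only an illustration: check_heap_settings is a hypothetical helper, the 320 MB floor is the "not less than" value listed for ServerMainHeapSize, and the physical memory figure must be supplied by the caller. It does not account for the additional memory that heavy loads, queries, or other applications may need.

```python
# Hypothetical pre-flight check for the heap settings discussed above.
MIN_SERVER_HEAP_MB = 320  # lower bound listed for ServerMainHeapSize


def check_heap_settings(server_heap_mb, loader_heap_mb, physical_memory_mb):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    if server_heap_mb < MIN_SERVER_HEAP_MB:
        problems.append("ServerMainHeapSize is below 320 MB")
    total_heap_mb = server_heap_mb + loader_heap_mb
    # The combined heaps must stay below physical memory to avoid swapping.
    if total_heap_mb >= physical_memory_mb:
        problems.append(
            "heap total of %d MB is not below physical memory of %d MB"
            % (total_heap_mb, physical_memory_mb)
        )
    return problems


# An 8 GB machine (8192 MB) with two candidate configurations.
print(check_heap_settings(6000, 400, 8192))  # fits within physical memory
print(check_heap_settings(7900, 400, 8192))  # exceeds physical memory
```

In practice, leave extra headroom below the physical memory figure for the operating system and any other services on the machine.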
3. Using Infobright
Starting and Stopping the Infobright Server
Windows
The Windows Install Wizard automatically creates Infobright as a Windows Service, which
allows the Infobright server to be started and stopped automatically when you boot or
shut down Windows.
To manually start the Infobright server, from the Windows Start Menu run:
To manually stop the Infobright server, from the Windows Start Menu run:
Linux
You can start and stop the Infobright server the same way you would start and stop the
original MySQL server (mysqld). Before using the Infobright server, see “Starting and
Stopping MySQL Automatically” in the MySQL 5.1 Reference Manual.
Important: It is recommended that you run Infobright using MySQL user credentials rather
than root for security reasons.
To start the Infobright server on Linux, run:
/etc/init.d/mysqld-ib start
To stop the Infobright server on Linux, run:
/etc/init.d/mysqld-ib stop
To start/stop the Infobright server during system boot/shutdown use the mysqld-ib
script in /etc/init.d/ for start and stop services. Use run level 2 3 4 5 to start the service,
and run level 0 1 6 to stop.
The following are sample commands to create services:
Windows
To connect to the Infobright command line interface, run:
Linux
If you used the standard install locations, enter the following command to connect to
Infobright:
/usr/bin/mysql-ib
If you used a different install location, modify the above command to point to your socket
file.
When the Infobright server is first installed, an administrator account with no password is
created. To connect to the administrator account, use the following command:
mysql-ib
To run a script when connecting to the administrator account, use the following
command:
mysql-ib < input_script_name.txt
For example:
mysql-ib < /tmp/testing/input.txt
To run a script when connecting to the administrator account and direct all output to a
text file, use the following command:
mysql-ib < input_script_name.txt > output_results.txt
For example:
mysql-ib < /tmp/testing/input.txt > /tmp/testing/output.txt
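When the same redirection has to be driven from a program rather than typed at a shell, it can be reproduced with Python's standard subprocess module. This is a minimal sketch: run_sql_script is a hypothetical helper, and it assumes that the mysql-ib command is on the PATH and reads statements from standard input, as the shell examples above imply.

```python
import subprocess


def run_sql_script(input_path, output_path, client=("mysql-ib",)):
    """Feed a script file to the client and save its output to a file.

    Equivalent in spirit to: mysql-ib < input_path > output_path
    Returns the client's exit code.
    """
    with open(input_path, "rb") as script, open(output_path, "wb") as results:
        completed = subprocess.run(list(client), stdin=script, stdout=results)
    return completed.returncode


# Example call (requires a running Infobright server, so it is left as a comment):
# run_sql_script("/tmp/testing/input.txt", "/tmp/testing/output.txt")
```

The client parameter exists only so that the command can be swapped, for example when the client is installed under a different path.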
During the Infobright server shutdown process, the server will not shut down until all
running commands are completed.
To force the shutdown of the server:
Kill the mysqld process and all running bhloader processes.
Infobright can be used with most Business Intelligence tools and any MySQL GUI client tool
like Toad or Navicat. Simply point to the IP address and socket number for the Infobright
server, and log on using any user credentials that have been set up.
After connecting to the Infobright administrator account, enter the following command at
the mysql command prompt:
mysql> show variables like "version_comment";
+-----------------+-------------------------------------------------+
| Variable_name | Value |
+-----------------+-------------------------------------------------+
| version_comment | build number (revision)=IB_3.4.2_r5IB_3.2_GA_5316 |
+-----------------+-------------------------------------------------+
1 row in set (0.00 sec)
Error log          Errors starting, stopping and running the Infobright server
                   (mysqld). To generate this log, add the following lines to my.cnf:
                   log-error=<filename>
                   log-output=FILE
General query log  Connection and statement information received from clients.
Infobright log     Server start and stop information. Also contains missing
                   configuration settings.
About Errors
Infobright reports the same errors as the standard MySQL server. For more information, see
“Appendix B. Errors, Error Codes, and Common Problems” in the MySQL 5.1 Reference
Manual.
There are a few additional errors specific to Infobright import and export commands. For
more information, see “About Import Errors” and “About Export Errors” in Chapter 7.
collation approaches, thus displaying inconsistent results for such things as sorts.
BH_RSI_Repository
Infobright.log
Infobright.seq
ib03.corp.infobright.com.err
mysql
test
NUMERIC TYPES
STRING TYPES
CHAR(N) 255
VARCHAR(N) 65532
BINARY(N) 255
VARBINARY(N) 65532
TINYTEXT 255
TEXT(N) 65535
See the section below, “About Column Options,” for information on supported and
unsupported options when creating columns.
Note: When creating a table, as a matter of practice one should always use the ENGINE=
option to ensure that the correct database engine is used. Infobright is shipped with
DEFAULT ENGINE = BRIGHTHOUSE, but this can be changed. The name of the
engine can be specified explicitly at the end of create table statement:
Lookup Columns
Infobright provides an additional modifier for string data type columns, called a lookup
column. The lookup column utilizes an integer substitution for values. You can declare a
lookup column on a CHAR or VARCHAR column to increase its compression and performance in
queries. However, to use a lookup column, the CHAR or VARCHAR column must meet the
following criteria:
• The column must have fewer than 10,000 distinct values.
• The column must contain a large number of duplicate values: the ratio of total number of
records to distinct values should be greater than 10.
Typically, a lookup column is useful for fields like state, gender, category, and the like where
the number of instances is very high, but the number of unique values is very low. To
determine the ratio of records to distinct values, first determine the number of distinct values
using SELECT COUNT(DISTINCT <COLUMN>) FROM… Then compare this to the
number of records using SELECT COUNT(<COLUMN>) FROM…
Note: Using a lookup on a column where there are more than 10,000 distinct values will
result in greatly reduced load speeds.
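The two criteria can be combined into a single check once the two counts have been collected. The following sketch is illustrative only: is_lookup_candidate is a hypothetical helper, and its arguments are assumed to be the results of the COUNT(<COLUMN>) and COUNT(DISTINCT <COLUMN>) queries described above.

```python
# Thresholds taken from the criteria listed in this section.
MAX_DISTINCT_VALUES = 10000
MIN_ROWS_PER_DISTINCT_VALUE = 10


def is_lookup_candidate(total_rows, distinct_values):
    """Return True if a CHAR or VARCHAR column meets both lookup criteria."""
    if distinct_values <= 0:
        return False  # empty column: nothing to gain from a lookup
    few_distinct = distinct_values < MAX_DISTINCT_VALUES
    many_duplicates = total_rows / distinct_values > MIN_ROWS_PER_DISTINCT_VALUE
    return few_distinct and many_duplicates


# A "state" column: 5,000,000 rows but only 50 distinct values.
print(is_lookup_candidate(5000000, 50))
# A "customer name" column: 50,000 rows and 20,000 distinct values.
print(is_lookup_candidate(50000, 20000))
```

The first call reports a suitable column and the second an unsuitable one, since 20,000 distinct values exceeds the limit and the ratio is only 2.5.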
To declare a column as a lookup column, add the comment 'lookup' on the column. Enter
the following command:
For more information, see “SHOW COLUMNS Syntax” in the MySQL 5.1 Reference
Manual.
Utilization of the FULL option will provide an estimate of the compression for each
column
+------------+---------------+-------------------+------+-----+---------+-------+-------------------+---------------------------------------+
+------------+---------------+-------------------+------+-----+---------+-------+-------------------+---------------------------------------+
| make_id | decimal(10,0) | NULL | YES | | NULL | | select,references | Size[MB]: 0.1; Ratio: 15.64; unique |
| make_name | varchar(25) | latin1_swedish_ci | YES | | NULL | | select,references | Size[MB]: 0.1; Ratio: 5.05 |
| model_name | varchar(25) | latin1_swedish_ci | YES | | NULL | | select,references | Size[MB]: 0.1; Ratio: 1.38 |
| record_dt | datetime | NULL | YES | | NULL | | select,references | Size[MB]: 0.1; Ratio: 3.86 |
+------------+---------------+-------------------+------+-----+---------+-------+-------------------+---------------------------------------+
To view the CREATE TABLE statement used to create a given table, enter the following
command:
For more information, see “SHOW CREATE TABLE Syntax” in the MySQL 5.1 Reference
Manual.
For more information, see “SHOW TABLE STATUS Syntax” in the MySQL 5.1 Reference
Manual.
+----------+-------------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+--
| Name | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment |
+----------+-------------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+--
+----------+-------------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+--
---------------------+---------------------+------------+-------------------+----------+----------------+----------------------------------+
---------------------+---------------------+------------+-------------------+----------+----------------+----------------------------------+
2008-08-28 05:30:44 | 2008-04-23 14:17:13 | NULL | latin1_swedish_ci | NULL | | Overall compression ratio: 3.622 |
---------------------+---------------------+------------+-------------------+----------+----------------+----------------------------------+
A database name and a column filter can be specified in optional clauses. For more
information, see “SHOW COLUMNS Syntax” in the MySQL 5.1 Reference Manual.
The compression statistics are provided in the column comment. In addition to the
compression information, the comment line may also contain a “unique” indicator, meaning
that the column has all unique values (except nulls).
For example:
Important: Queries that evaluate against UTF-8 character data columns will execute more
slowly than an equivalent query against ASCII character data, due to
ASCII support of Character Maps in the Knowledge Grid (see Chapter 8). UTF-8
specific Knowledge Grid extensions will be available in an upcoming release.
of groups and their definitions—for example, ‘aaa’ and ‘AAA’ define different groups)
and DISTINCT results. WHERE conditions may also be affected if you are expecting a
different sorting order than the one used by Infobright.
• To simulate Infobright collation in the MySQL engine, set latin1_bin collation while
creating a table (for more information, see “Table Character Set and Collation” in the
MySQL 5.1 Reference Manual). Enter the following command:
Padding
Infobright treats padding differently than other DBMS engines. Infobright assumes literal
comparisons of text fields, including all whitespace characters. Therefore, a string containing
two spaces is different than a string containing one space or an empty (0 length) string, which
is also different than the NULL value.
The Infobright padding definition is compatible with the SQL standard. However, most
DBMS systems have defined less restricted, customizable rules regarding text comparison.
For example, ‘abc ’ = ‘abc’ may be true in some databases but is not true in Infobright.
Note: In CHAR columns, trailing spaces are trimmed on LOAD, whereas in VARCHAR
columns values are loaded with all spaces.
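The padding rules can be modeled outside the database to make the consequences concrete. The sketch below is a plain Python model, not Infobright code: value_after_load and literal_equal are hypothetical names, None stands in for NULL, and a comparison involving NULL is simply treated as not matching.

```python
def value_after_load(value, column_type):
    """Model the LOAD behaviour in the note above."""
    if value is None:
        return None  # NULL stays NULL
    if column_type == "CHAR":
        return value.rstrip(" ")  # CHAR: trailing spaces are trimmed
    return value  # VARCHAR: loaded with all spaces


def literal_equal(left, right):
    """Literal comparison: every character counts, including whitespace."""
    if left is None or right is None:
        return False  # NULL matches nothing in this model
    return left == right


print(literal_equal("abc ", "abc"))              # trailing space matters
print(literal_equal("  ", " "))                  # two spaces vs. one space
print(repr(value_after_load("abc  ", "CHAR")))     # trimmed on load
print(repr(value_after_load("abc  ", "VARCHAR")))  # kept as supplied
```

One practical consequence is that the same input value can compare differently depending on whether it was loaded into a CHAR or a VARCHAR column.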
You can disable AUTOCOMMIT by setting the parameter to 0 (zero) and enable AUTOCOMMIT by
setting the parameter to 1. If AUTOCOMMIT is set to 1, then when a LOAD is completed, the
transaction is automatically committed.
To commit the current transaction, enter the following command:
mysql> commit;
If you have not yet committed a LOAD DATA INFILE transaction, you can roll back the
transaction. This will restore the import tables to the state that existed before the current
transaction. Enter the following command:
mysql> rollback;
Using COMMIT and ROLLBACK makes it possible to check the load within the same session
before committing the data, as the loaded data is available (viewable) to the load session. For
instance, you could check something about the data (such as the number of records loaded)
before committing.
After importing data using the LOAD DATA INFILE command, the status of the import and the
number of affected rows is shown. All uncommitted rows, including those from previous
imports, are shown; therefore, the number of affected rows may be greater than the number
of rows in the file you just imported.
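A small worked example may help when reading the reported row count. It assumes, as the paragraph above suggests, that the figure is the running total of all rows loaded but not yet committed in the session; the helper name and the row counts are invented for illustration.

```python
def affected_rows_reported(uncommitted_load_sizes):
    """Sum the rows of every LOAD in the session that is not yet committed."""
    return sum(uncommitted_load_sizes)


# Two files loaded in one session with AUTOCOMMIT disabled and no COMMIT yet.
first_file_rows = 1000
second_file_rows = 250

print(affected_rows_reported([first_file_rows]))                    # after the first LOAD
print(affected_rows_reported([first_file_rows, second_file_rows]))  # after the second LOAD
```

Under this assumption the second LOAD reports 1250 rows even though its input file contains only 250.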
Failure Handling
If AUTOCOMMIT is disabled and the Infobright server is terminated during an import session,
the following occurs:
• Infobright does not store the rows that were loaded during the failed import operation.
• The input file and the database files are not harmed. To load data from the input file,
repeat the LOAD operation.
If AUTOCOMMIT is disabled and the Infobright server is terminated after an import session is
completed successfully but is not committed, the following occurs:
• The transaction is rolled back and the imported data is lost when the server restarts.
• The input file and the database files are not harmed by the failed import operation (the
database is unaffected, as if the import session did not occur). To re-import the data,
repeat the LOAD operation.
If the Infobright server is terminated during an export operation to a disk file, the following
occurs:
• A non-empty file is saved on disk; however, the last row in the saved file is inconsistent.
• The database files are not harmed by the failed export operation. To export the data,
repeat the export operation.
If Infobright tries to import data from a file created during a failed export session, the
following occurs:
• No data is inserted because the input file consists of corrupted table rows. No new
records are added to the database files, so no harm is done.
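Before re-importing a file that may have come from an interrupted export, a quick structural check on its last row can save a failed load. The following sketch is a heuristic only: last_row_looks_incomplete is a hypothetical helper, the ";" delimiter and newline terminator are assumptions about the export settings, and the check ignores quoting and escape characters.

```python
def last_row_looks_incomplete(export_text, delimiter=";", terminator="\n"):
    """Heuristic: flag a file whose last row is cut short.

    A row is considered suspicious if the file does not end with the line
    terminator, or if the last row has a different number of delimiters
    than the first row.
    """
    if not export_text:
        return False  # an empty file has no rows to inspect
    if not export_text.endswith(terminator):
        return True
    # Drop the empty piece that follows the final terminator.
    rows = export_text.split(terminator)[:-1]
    expected_delimiters = rows[0].count(delimiter)
    return rows[-1].count(delimiter) != expected_delimiters


print(last_row_looks_incomplete("1;alpha\n2;beta\n"))  # complete file
print(last_row_looks_incomplete("1;alpha\n2;be"))      # cut off mid-row
```

A file that passes this check is not guaranteed to be complete, so repeating the export remains the reliable remedy.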
Escape Characters
The Infobright Loader supports escape character definition and usage.
Other DBMS systems may have different representations of the NULL value; for example,
MySQL only recognizes the representation \N for a NULL value. This can create issues if you
export data from Infobright and import the data into MySQL. Since MySQL will only look for
\N and will not recognize the Infobright representation of the NULL value, MySQL will change
the NULL value into the default values in numeric and string columns.
Exporting Data
To export data from an Infobright table, use the following MySQL export command:
mkfifo /pipe_test/thepipe.pipe
chmod 666 /pipe_test/thepipe.pipe
Once the pipe is set up, direct the data to it from either a file or a process:
rm thepipe.pipe
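The create, use, and remove cycle for a named pipe can also be scripted. The sketch below uses only Python's standard library on a Unix-like system; send_through_pipe is a hypothetical helper, the pipe is created in a temporary directory rather than /pipe_test, and the reading side stands in for whatever process consumes the pipe (in practice, the load statement).

```python
import os
import tempfile
import threading


def send_through_pipe(rows):
    """Create a named pipe, push rows through it, then remove it."""
    pipe_dir = tempfile.mkdtemp()
    pipe_path = os.path.join(pipe_dir, "thepipe.pipe")
    os.mkfifo(pipe_path)        # same effect as: mkfifo thepipe.pipe
    os.chmod(pipe_path, 0o666)  # same effect as: chmod 666 thepipe.pipe

    def producer():
        # Opening a FIFO for writing blocks until a reader opens it.
        with open(pipe_path, "w") as pipe:
            for row in rows:
                pipe.write(row + "\n")

    writer = threading.Thread(target=producer)
    writer.start()
    try:
        # The reader plays the role of the consuming process.
        with open(pipe_path) as pipe:
            received = pipe.read().splitlines()
        writer.join()
    finally:
        os.remove(pipe_path)    # same effect as: rm thepipe.pipe
        os.rmdir(pipe_dir)
    return received


print(send_through_pipe(["1;alpha", "2;beta"]))
```

The producer runs in its own thread because each end of a FIFO waits for the other; with a real load, the writer and the database session are separate processes and the same rendezvous happens between them.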
Code | Error | Description | Action
1 | Cannot open file or pipe | Cannot open a file or a pipe containing input data | Ensure the file exists and the path is entered correctly
2 | Wrong data or column definition | Format of data does not comply with table definition | Ensure the data being imported is the correct data type and does not exceed the size specified
3 | Syntax error | Not used | N/A
Code | Error | Description | Action
1 | Cannot open file or pipe | Cannot open a file or a pipe for output | Ensure the file exists and the path is entered correctly
2 | Wrong data or column definition | Not used | Ensure the data being exported is the correct data type and does not exceed the size specified
3 | Syntax error | Not used | Check the export syntax
4 | Cannot connect to the database | Not used | Ensure database exists, the correct path is given and Infobright is started
5 | Unknown error | Unspecified error occurred | Contact customer support
6 | Wrong parameter | Wrong value for one of the export parameters | Make sure the correct parameter is used (see “Setting Import and Export Parameters”)
7 | Data conversion error | Not used | Ensure the data is the correct column type
USE Northwind;
Running Queries
To run queries on Infobright tables, use the following standard MySQL syntax:
mysql> select …;
The Infobright Optimizer is the primary engine used to resolve queries. While significant
additions have been made to the library of supported SQL, there are cases where the query
will still be executed by the MySQL query engine instead of the Infobright engine. In this
event, query response time tends to suffer because the MySQL engine is row-oriented
and therefore cannot make use of the Knowledge Grid information, and in some
cases it can be too slow to be usable. For best performance, ensure your queries (and VIEWs)
contain only syntax supported by the Infobright Optimizer. For more information on the
SELECT syntax supported in Infobright, see “Appendix A - Infobright Optimizer – Supported
Functions and Operators”.
AllowMySQLQueryPath=1
If the MySQL query path is disabled, then the following message will be returned if the query
would have otherwise been directed to MySQL for processing:
The query includes syntax that is not supported by the Infobright Optimizer.
Infobright suggests either restructure the query with supported syntax, or
enable the MySQL Query Path in the brighthouse.ini file to execute the query
with reduced performance.
This will occur when functions not optimized in Infobright are used. If you get poor query
performance, you should execute the command below to identify if a query has been directed
to the MySQL query engine.
After running a query, enter the following command to view any warnings:
The following message indicates that the query was directed to MySQL for processing:
Important: When queries are executed on Infobright tables by the standard MySQL engine,
performance can be significantly slower than when queries are executed by
Infobright.
Terminating a Query
If you want to terminate a query executed from a client session before the query is complete,
do the following:
1. Use the show [full] processlist command to determine the query’s process ID.
2. Use the kill <id> command to terminate the query.
OR
If you are using a command-line MySQL client, you can also use Ctrl+C to terminate the
query.
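The two steps above can be automated once the process list has been fetched by a client program. The sketch below does not connect to a server: kill_commands is a hypothetical helper that takes rows in the column order MySQL uses for SHOW FULL PROCESSLIST (Id, User, Host, db, Command, Time, State, Info) and returns the kill statements to issue; the sample rows are invented.

```python
def kill_commands(processlist_rows, min_seconds):
    """Build kill statements for queries running at least min_seconds.

    Each row is a tuple in SHOW FULL PROCESSLIST column order:
    (Id, User, Host, db, Command, Time, State, Info).
    """
    commands = []
    for row in processlist_rows:
        process_id, command, seconds, statement = row[0], row[4], row[5], row[7]
        # Only sessions that are actively running a statement are candidates.
        if command == "Query" and statement and seconds >= min_seconds:
            commands.append("kill {};".format(process_id))
    return commands


sample_rows = [
    (11, "root", "localhost", "test", "Sleep", 900, "", None),
    (12, "analyst", "localhost", "test", "Query", 1800, "executing", "select ..."),
    (13, "root", "localhost", None, "Query", 0, None, "show full processlist"),
]
print(kill_commands(sample_rows, min_seconds=600))
```

Reviewing the generated statements before running them is advisable, since kill <id> acts on whichever session currently holds that ID.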
CREATE
[OR REPLACE]
VIEW view_name [(column_list)]
AS select_statement
A VIEW must contain unique column names. If you select two columns with the same name
from separate tables, at least one must be aliased or the column list option must be used.
If the View’s select statement contains functionality that is not supported in the Infobright
optimizer, then the VIEW will perform sub-optimally since it will always flip over to the
MySQL query engine.
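As an illustration of the unique-column-name rule, the sketch below joins two hypothetical tables, orders and customers, that both have an id column, and gives each one an alias. The helper name, the schema, and the user and database names are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical schema: orders(id, customer_id), customers(id, name).
# Both tables have an "id" column, so each is aliased in the view.
create_view_sql() {
  cat <<'SQL'
CREATE OR REPLACE VIEW order_customers AS
SELECT o.id AS order_id, c.id AS customer_id, c.name
FROM orders o INNER JOIN customers c ON o.customer_id = c.id;
SQL
}

# To run against a server (user and database are placeholders):
#   create_view_sql | mysql -u user1 -p mydb
```

The column list option achieves the same thing by naming the columns after the view name, for example VIEW order_customers (order_id, customer_id, name), in which case the aliases in the select statement are not needed.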
Select Syntax
For more information, see “SELECT Syntax” in the MySQL 5.1 Reference Manual.
[ HAVING where_condition ]
[ ORDER BY {col_name | expr | position } [ ASC | DESC ], … ]
[ LIMIT { [ offset,] row_count | row_count OFFSET offset} ]
[ INTO OUTFILE ‘file_name’ export_options
- AS alias_name
- ORDER BY NULL ]
Join Syntax
For more information, see “JOIN Syntax” in the MySQL 5.1 Reference Manual.
Infobright supports the following JOIN syntax for the table_references part of SELECT
statements (as described in the previous section, “Select Syntax”):
table_factor:
tbl_name [ [ AS ] alias]
join_table:
table_reference [ INNER | CROSS ] JOIN table_factor [join_condition]
| table_reference STRAIGHT_JOIN table_factor
| table_reference STRAIGHT_JOIN table_factor ON condition
| table_reference {LEFT|RIGHT} [OUTER] JOIN table_reference join_condition
join_condition:
ON conditional_expr | USING (column_list)
Union Syntax
For more information, see “UNION Syntax” in the MySQL 5.1 Reference Manual.
SELECT ….
UNION [ ALL | DISTINCT ] SELECT …
[ UNION [ ALL | DISTINCT ] SELECT … ]
Subqueries
For more information, see “Subquery Syntax” in the MySQL 5.1 Reference Manual.
Query Performance
Due to Infobright’s column-oriented data organization and other Infobright-specific features,
query optimization in Infobright differs slightly from traditional DBMS approaches.
• Infobright works well with data tables containing many columns, where only necessary
columns are accessed by the query (as opposed to SELECT *). The traditional approach
suggests keeping records as small as possible (e.g., using schema normalization and table
decomposition). However, in Infobright, only necessary columns are used in calculations.
Therefore, queries with many limiting conditions on many columns of the same table are
especially well optimized in Infobright.
• In traditional DBMS systems, better performance can be achieved by creating indices. In
Infobright, Knowledge Nodes are used instead of indices (Knowledge Nodes are created
automatically). To further enhance performance, you can try to influence the data loading
procedure by keeping similar data (e.g., for similar time frames) close together. The order
in which data are loaded may influence both compression ratio and query speed.
• Avoid using OR in queries and, if possible, use IN instead. In some cases ORs can be
translated to UNION ALL or IN, for example: “...WHERE a=1 OR a=2...” could be replaced
by “...WHERE a IN (1,2)...”.
• Try to replace correlated subqueries with joins and independent subqueries.
• Executing queries in steps may also help with missing function support. For instance,
execute the bulk of the query in Infobright and export the data to a MyISAM table. Then
execute the function query on the result set.
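The last bullet, executing a query in steps, can be sketched as follows. Everything named here is hypothetical: the Infobright table sales(sale_date, amount), the MyISAM table daily_totals, the file /tmp/daily.csv, and the user and database names. The sketch also assumes that your Infobright version accepts the export options shown and that the server can write to the export path. WEEKDAY is used because Appendix A lists it as handled by the MySQL engine.

```shell
#!/bin/sh
# Hypothetical schema: Infobright table sales(sale_date DATE, amount DECIMAL).
# Prints the three steps as SQL so they can be reviewed before running.
stepwise_sql() {
  cat <<'SQL'
-- Step 1: do the heavy aggregation in Infobright and export the result.
SELECT sale_date, SUM(amount) FROM sales GROUP BY sale_date
  INTO OUTFILE '/tmp/daily.csv' FIELDS TERMINATED BY ',';

-- Step 2: load the small result set into a MyISAM table.
CREATE TABLE daily_totals (sale_date DATE, total DECIMAL(18,2)) ENGINE=MyISAM;
LOAD DATA INFILE '/tmp/daily.csv' INTO TABLE daily_totals
  FIELDS TERMINATED BY ',';

-- Step 3: apply the function that Infobright does not optimize.
SELECT WEEKDAY(sale_date) AS day_of_week, SUM(total)
  FROM daily_totals GROUP BY day_of_week;
SQL
}

# To run against a server (user and database are placeholders):
#   stepwise_sql | mysql -u user1 -p mydb
```

The point of the split is that only the small, pre-aggregated result set is processed by the MySQL engine, while the scan over the large table stays in Infobright.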
To optimize your query performance, avoid the following, each of which results in the query
being handled by the MySQL query engine:
• Using functions or type cast operators.
• Creating queries containing mixed Infobright and MySQL tables.
• Performing comparisons or arithmetical operations on two different data types (such as
numbers and text).
• Creating JOINs with the JOIN condition defined as NOT BETWEEN.
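For the first item in the list, a rough pre-check can flag queries that call functions marked "No (MySQL engine)" in Appendix A. The sketch below is a text heuristic, not a SQL parser, and the function list is a partial one taken from that appendix. The names FALLBACK_FUNCS and find_fallback_functions are hypothetical.

```shell
#!/bin/sh
# Partial list of functions that Appendix A marks "No (MySQL engine)".
FALLBACK_FUNCS='CHAR|LOAD_FILE|UNHEX|GET_FORMAT|LAST_DAY|MAKEDATE|MAKETIME|MICROSECOND|SEC_TO_TIME|STR_TO_DATE|TIMESTAMP|TIMESTAMPADD|TIMESTAMPDIFF|TIME_TO_SEC|UTC_TIMESTAMP|WEEKDAY|WEEKOFYEAR|YEAR'

# Print, one per line and in upper case, the listed functions that the
# query text appears to call. String literals and comments can fool it.
find_fallback_functions() {
  printf '%s\n' "$1" |
    # Match a listed name that starts a word and is followed by "(".
    grep -Eio "(^|[^[:alnum:]_])($FALLBACK_FUNCS)[[:space:]]*\(" |
    # Keep only the function name from each match.
    tr -cd 'A-Za-z_\n' |
    tr 'a-z' 'A-Z' |
    sort -u
}

# Example:
#   find_fallback_functions "SELECT WEEKDAY(d), YEARWEEK(d) FROM t"
```

An empty result does not guarantee that the query stays in the Infobright optimizer, since type casts, mixed-engine queries, and the other items in the list are not checked. Viewing the warnings after the query runs remains the reliable test.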
Restore Procedure
To restore the Infobright databases from a backup copy, do the following:
1. Replace the entire data directory (usually the data subdirectory in your Infobright
installation directory) with the backup copy.
2. Replace the KNFolder with the backup copy (if the KNFolder is not inside the data
directory).
Important: Do not manually modify database files or move them from one database to
another. Doing so may lead to data corruption and unpredictable results.
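The restore steps can be sketched as a small shell helper. It assumes the Infobright server is stopped before the directories are replaced (an assumption, since the steps above do not say so), and it moves the current directory aside under a .old suffix instead of deleting it. The function name and all paths are hypothetical.

```shell
#!/bin/sh
# Replace a directory (the data directory or the KNFolder) with its backup.
# Assumption: the Infobright server is stopped before this runs.
restore_dir() {
  backup=$1   # path to the backup copy
  target=$2   # path of the directory to replace
  [ -d "$backup" ] || { echo "backup not found: $backup" >&2; return 1; }
  # Move the current directory aside so it can be recovered if needed.
  if [ -e "$target" ]; then
    mv "$target" "$target.old.$$" || return 1
  fi
  # -R copies recursively; -p preserves mode and timestamps
  # (and ownership where permitted).
  cp -Rp "$backup" "$target"
}

# Example (placeholder paths):
#   restore_dir /path/to/backup/data /path/to/infobright/data
#   restore_dir /path/to/backup/KNFolder /path/to/KNFolder
```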
Logical Operators
NOT, ! YES (except in join conditions)
AND, && YES
OR, || YES
XOR No (MySQL engine)
String Functions
ASCII YES
BIN YES
BIT_LENGTH YES
CHAR No (MySQL engine)
CHAR_LENGTH YES
CHARACTER_LENGTH YES
CONCAT YES
CONCAT_WS YES
CONV YES
ELT YES
EXPORT_SET YES
FIELD YES
FIND_IN_SET YES
FORMAT YES
HEX YES
INSTR YES
LCASE YES
LEFT YES
LENGTH YES
LOAD_FILE No (MySQL engine)
LOCATE YES
LOWER YES
LPAD YES
LTRIM YES
MAKE_SET YES
MID YES
OCT YES
OCTET_LENGTH YES
ORD YES
POSITION YES
QUOTE YES
REPEAT YES
REPLACE YES
REVERSE YES
RIGHT YES
RPAD YES
RTRIM YES
SOUNDEX YES
SOUNDS LIKE No (MySQL engine)
SPACE YES
SUBSTR YES
SUBSTRING YES
SUBSTRING_INDEX YES
TRIM YES
UCASE YES
UNHEX No (MySQL engine)
UPPER YES
Numeric Functions
Addition ( + ) YES
Subtraction ( - ) YES
Multiplication ( * ) YES
Division ( / ) YES
Modulo ( % ) YES
ABS YES
ACOS YES
ASIN YES
ATAN2, ATAN YES
ATAN YES
CEIL YES
CEILING YES
CONV YES
COS YES
COT YES
DEGREES YES
EXP YES
FLOOR YES
LN YES
LOG10 YES
LOG2 YES
LOG YES
MOD YES
OCT YES
PI YES
POW YES
POWER YES
RADIANS YES
RAND YES
ROUND YES
SIGN YES
SIN YES
SQRT YES
TAN YES
TRUNCATE YES
FROM_UNIXTIME YES
GET_FORMAT No (MySQL engine)
HOUR YES
LAST_DAY No (MySQL engine)
LOCALTIME YES
LOCALTIMESTAMP YES
MAKEDATE No (MySQL engine)
MAKETIME No (MySQL engine)
MICROSECOND No (MySQL engine)
MINUTE YES
MONTH YES
MONTHNAME YES
NOW YES
PERIOD_ADD YES
PERIOD_DIFF YES
QUARTER YES
SECOND YES
SEC_TO_TIME No (MySQL engine)
STR_TO_DATE No (MySQL engine)
SUBDATE YES
SUBTIME YES
SYSDATE YES
TIME YES
TIMEDIFF YES
TIMESTAMP No (MySQL engine)
TIMESTAMPADD No (MySQL engine)
TIMESTAMPDIFF No (MySQL engine)
TIME_FORMAT YES
TIME_TO_SEC No (MySQL engine)
TO_DAYS YES
UNIX_TIMESTAMP YES
UTC_DATE YES
UTC_TIME YES
UTC_TIMESTAMP No (MySQL engine)
WEEK YES
WEEKDAY No (MySQL engine)
WEEKOFYEAR No (MySQL engine)
YEAR No (MySQL engine)
YEARWEEK YES
Group By Modifiers
ROLLUP No (error signalled)
conv-map   Optional   Absolute path to file with collations conversions. If not specified, CHMT would try to use file: chmt-binary-folder/../support-files/collations.txt; if not found there, it would search for: chmt-binary-folder/collations.txt
Log Structure
The logs detail information about every considered table found in a specified datadir. Each
conversion finishes with [NOT NEEDED], [PASS] or [FAILED] status.
SELINUX=disabled
Swappiness
Set low swappiness to avoid unnecessary paging. This only helps for machines with low
levels of memory (say 4GB with 3GB allocated for Infobright).
In /etc/rc.local add:
Larger Readahead
In /etc/rc.local add:
Replace sd<x> with the proper device symbol (e.g., sdc). It should be the drive(s) on which
the datadir and/or CacheFolder resides.
In /etc/fstab add:
Note: This is for data folders only. Linux boot partition can be ext3.
noatime
Use the noatime option when mounting database and cache volumes (see below for details).
Otherwise the system will update the access time for files and directories (which degrades
performance).
Deadline Elevator
The default scheduler, CFQ, is 1% faster than the deadline elevator for a single user. However,
in a multi-user test with 4 users, the deadline elevator had 20% better performance.
In /etc/rc.local add:
Replace sd<x> with the proper device symbol (e.g., sdc). It should be the drive(s) on which
the datadir and/or CacheFolder resides.
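The three /etc/rc.local settings discussed in this chapter (swappiness, readahead, and the deadline elevator) map onto standard Linux interfaces: /proc/sys/vm/swappiness, blockdev --setra, and /sys/block/<device>/queue/scheduler. The sketch below prints candidate lines for one device so they can be reviewed before being added to /etc/rc.local. The helper name rc_local_lines is hypothetical, and the default swappiness and readahead values are illustrative assumptions, not values taken from this guide.

```shell
#!/bin/sh
# Print candidate /etc/rc.local lines for one block device.
# The swappiness and readahead defaults are illustrative assumptions;
# tune them for your hardware and workload.
rc_local_lines() {
  dev=$1                 # device symbol such as sdc (no /dev/ prefix)
  swappiness=${2:-10}    # assumed low swappiness value
  readahead=${3:-2048}   # assumed readahead, in 512-byte sectors
  case "$dev" in
    ''|*[!a-z0-9]*)
      echo "usage: rc_local_lines <device> [swappiness] [readahead]" >&2
      return 1
      ;;
  esac
  printf 'echo %s > /proc/sys/vm/swappiness\n' "$swappiness"
  # The blockdev path may differ on your distribution.
  printf '/sbin/blockdev --setra %s /dev/%s\n' "$readahead" "$dev"
  printf 'echo deadline > /sys/block/%s/queue/scheduler\n' "$dev"
}

# Example: print the lines for sdc, then add them to /etc/rc.local as root.
#   rc_local_lines sdc
```

Printing the lines rather than applying them keeps the helper safe to run as an ordinary user. The settings themselves require root and take effect only when the printed commands are executed.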
# ulimit -a
To set it to a new value for this running session, which takes effect immediately, run
command:
# ulimit -n 8800
The two lines above change the max number of file handles - nofile - to new settings.
• Save the file.
• Log in as user1 again. The new changes will be in effect.