Business Intelligence: Explain The Architecture of SQL Reporting Service
1. A set of tools that can be used to create, manage, and view reports.
2. A report server component that hosts and processes reports in a variety of formats.
3. An API that allows you to integrate or extend data and report processing in
applications, or to create custom tools to build or manage reports.
Reporting Services runs as a middle-tier server, as part of your existing server architecture.
SQL Server 2000 should be installed for the database server, and Internet Information Services
6.0 as a Web server.
The report server engine takes in report definitions, locates the corresponding data, and produces
the reports.
Interaction with the engine can be done through the Web-based Report Manager, which also lets
you manage refresh schedules and notifications.
End users view the report in a Web browser, and can export it to PDF, XML, or Excel.
SQL Server Reporting Services offers a variety of interactive and printed reports managed through a
web interface. Reporting Services is a server-based environment.
Report Manager is a web application. In SSRS it is accessed by a URL. The interface of
Report Manager depends on the permissions of the user. This means that to access any functionality
or perform any task, the user must be assigned a role. A user whose role has full permissions can
access all the features and menus of Report Manager. To configure Report Manager, a URL needs to
be defined.
What is an index?
Indexes in SQL Server are similar to the indexes in books: they help SQL Server retrieve
data more quickly. There are two types: clustered indexes and non-clustered indexes. Rows in the
table are stored in the order of the clustered index key.
There can be only one clustered index per table.
Non-clustered indexes have their own storage, separate from the table data storage.
Non-clustered indexes are stored as B-tree structures.
Their leaf-level nodes hold the index key and its row locator.
In a non-clustered index, the logical order of the index doesn't match the physical order of the
data stored on disk.
The leaf level of a non-clustered index contains the index keys and pointers to the table records.
There can be one or more non-clustered indexes on a table.
Answer
A clustered index reorders the way records are physically stored. In a non-clustered index, the
logical order of the index does not match the physical stored order of the rows on disk. A lookup
through a clustered index is usually faster because its leaf entries are the data records
themselves, so no extra lookup is needed. There can be just one clustered index per table, while
there can be up to 249 non-clustered indexes in SQL Server 2005 and earlier (999 from SQL Server
2008 onward).
Both are stored as B-tree structures. The leaf level of a clustered index is the actual data, whereas
the leaf level of a non-clustered index is a pointer to the data.
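The effect of a secondary index on a lookup can be sketched with a small runnable example. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the `employee` table and `ix_employee_last_name` index are hypothetical names. SQLite's plan output differs from SQL Server's, but the idea is the same: without the index the engine scans the table, and with it the engine searches a separate B-tree keyed on the column.

```python
import sqlite3

# SQLite is used here only as a convenient stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER, last_name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(i, f"name{i}", 1000 + i) for i in range(1000)],
)


def query_plan(sql):
    """Return the plan detail strings SQLite reports for a statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


lookup = "SELECT * FROM employee WHERE last_name = 'name500'"
plan_without_index = query_plan(lookup)

# A secondary index: a separate B-tree keyed on last_name whose entries
# point back to the table rows.
conn.execute("CREATE INDEX ix_employee_last_name ON employee (last_name)")
plan_with_index = query_plan(lookup)

print(plan_without_index)  # a full scan of the table
print(plan_with_index)     # a search that names the index
```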
The fill factor option is provided for fine-tuning index data storage and performance.
When an index is created or rebuilt, the fill factor value determines the percentage of space on
each leaf-level page to be filled with data; the remainder is reserved as free space for future
growth. Leaving this space free on the leaf-level pages decreases the occurrence of page splits
when new data has to be accommodated in the future.
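The arithmetic behind a fill factor can be sketched roughly as follows. The figures are assumptions for illustration: 8096 bytes of row space per page (an 8 KB page minus its 96-byte header), a hypothetical fixed row size, and no allowance for per-row overhead, so real page counts will differ.

```python
PAGE_DATA_BYTES = 8096  # assumption: 8 KB page minus the 96-byte page header


def rows_per_leaf_page(row_bytes, fill_factor):
    """Approximate number of rows placed on a leaf page when the index is built."""
    if not 1 <= fill_factor <= 100:
        raise ValueError("fill factor must be between 1 and 100")
    usable = PAGE_DATA_BYTES * fill_factor // 100  # bytes filled at build time
    return usable // row_bytes


# Hypothetical 100-byte index rows.
full = rows_per_leaf_page(row_bytes=100, fill_factor=100)
partial = rows_per_leaf_page(row_bytes=100, fill_factor=80)

print(full, partial, full - partial)  # rows at 100%, rows at 80%, slots left free
```

A lower fill factor leaves more room on each page for later inserts, at the cost of more pages to store and read.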
A pad index specifies index padding. When it is set to ON, the free space percentage given by the
fill factor is also applied to the intermediate-level pages of the index. When it is set to
OFF, or when no fill factor is specified, the intermediate-level pages are filled to near capacity,
leaving enough space for at least one row of the maximum size the index can have.
What is the difference between UNION and UNION ALL?
UNION combines the results of two queries and removes duplicate rows, whereas UNION ALL returns
all rows, including duplicates.
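The difference is easy to see on a small data set. The sketch below assumes Python's built-in sqlite3 module as a stand-in for SQL Server (the two operators behave the same way here), and the `customers_2023` and `customers_2024` tables are hypothetical.

```python
import sqlite3

# SQLite is used here only as a convenient stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers_2023 (city TEXT);
    CREATE TABLE customers_2024 (city TEXT);
    INSERT INTO customers_2023 VALUES ('Pune'), ('Delhi');
    INSERT INTO customers_2024 VALUES ('Delhi'), ('Mumbai');
""")

# UNION removes the duplicate 'Delhi' row.
union_rows = conn.execute(
    "SELECT city FROM customers_2023 UNION "
    "SELECT city FROM customers_2024 ORDER BY city"
).fetchall()

# UNION ALL keeps every row from both queries.
union_all_rows = conn.execute(
    "SELECT city FROM customers_2023 UNION ALL "
    "SELECT city FROM customers_2024 ORDER BY city"
).fetchall()

print(union_rows)
print(union_all_rows)
```

Because UNION ALL skips the duplicate-removal step, it is generally the cheaper choice when duplicates are impossible or acceptable.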
Log shipping is the process of automatically backing up the database and transaction log
files on a production SQL Server and then restoring them on a standby/backup server. This keeps the two
SQL Server instances in sync with each other. If the production server fails, users simply need
to be pointed to the standby/backup server. Log shipping primarily consists of 3 operations:
backing up the transaction log on the primary server, copying the backup file to the standby
server, and restoring it there.
Temporary tables allow short-term use of data in SQL Server. They are of 2 types:
Local
Named with a single # prefix. Only available to the connection that created them, and dropped
when that connection is closed.
Global
Named with a ## prefix. Available to any connection once created. They are dropped when the
creating connection closes and no other connection is still using them.
What is the STUFF and how does it differ from the REPLACE function?
The STUFF function inserts a string into another string: it deletes a specified number of
characters at a given start position and inserts the new string there.
REPLACE, on the other hand, does not work on a position: it replaces every occurrence of a
specified substring with another string.
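The two behaviours can be mimicked in a few lines of Python. This is a sketch, not SQL Server's implementation: `stuff` is a hypothetical helper written to follow the documented STUFF semantics (1-based start position, NULL for out-of-range arguments), and Python's `str.replace` is always case-sensitive, whereas T-SQL REPLACE follows the collation in effect.

```python
def stuff(text, start, length, insert):
    """Mimic T-SQL STUFF: delete `length` characters beginning at the
    1-based position `start`, then insert `insert` at that position."""
    if start < 1 or start > len(text) or length < 0:
        return None  # T-SQL STUFF returns NULL for out-of-range arguments
    return text[:start - 1] + insert + text[start - 1 + length:]


# Position-based: like STUFF('abcdef', 2, 3, 'ijklmn')
stuffed = stuff("abcdef", 2, 3, "ijklmn")

# Occurrence-based: like REPLACE('abcabc', 'b', 'XY')
replaced = "abcabc".replace("b", "XY")

print(stuffed)
print(replaced)
```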
Truncate
It is the fastest way to remove all the records from a table.
Delete
The DELETE command removes records one at a time and logs each deletion in the transaction log.
It can be used with or without a WHERE clause.
The deleted records can be rolled back.
It activates triggers.
It doesn't reset the identity value of the column.
Answer
Purposes and advantages of stored procedures:
Shared lock
Update lock
Exclusive lock
Shared lock
Update locks
Exclusive locks
This kind of lock is used with data modification operations like update, insert or delete.
Having Clause: This clause specifies a search condition for a group or an aggregate.
It can only be used with a SELECT statement. It is typically used with a GROUP BY clause; without
GROUP BY, it behaves like a WHERE clause.
Where Clause: This clause narrows down the rows being processed according to a condition, before
any grouping takes place.
A WHERE clause reduces the number of rows to be returned and, when a suitable index exists, can
avoid a full table scan. It can be used with SELECT, UPDATE, and DELETE statements.
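A single query can use both clauses, which makes the order of evaluation visible. The sketch below assumes Python's built-in sqlite3 module as a stand-in for SQL Server, and the `orders` table and its rows are hypothetical.

```python
import sqlite3

# SQLite is used here only as a convenient stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount INTEGER);
    INSERT INTO orders VALUES
        ('alice', 50), ('alice', 200), ('bob', 30), ('bob', 40), ('carol', 500);
""")

rows = conn.execute("""
    SELECT customer, SUM(amount) AS total
    FROM orders
    WHERE amount >= 40          -- filters individual rows before grouping
    GROUP BY customer
    HAVING SUM(amount) > 100    -- filters whole groups after aggregation
    ORDER BY customer
""").fetchall()

print(rows)
```

Here bob's 30 is dropped by WHERE, and the remaining bob group (total 40) is then dropped by HAVING.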
What is NOLOCK?
NOLOCK is a table hint used to improve concurrency in a system. With the NOLOCK hint, no shared
locks are acquired when data is read, and the read is not blocked by exclusive locks held by other
transactions. It is used in SELECT statements. This allows dirty reads: another process could be
updating the data at the exact time it is being read. It may also result in rows being missed or
read twice.
Transaction
A transaction is a set of operations that works as a single unit. The transactions can be categorized into
explicit, auto commit, and implicit transactions. Every transaction should follow four properties called the
ACID properties i.e. atomicity, consistency, isolation, and durability.
Atomicity
A transaction ensures that either all of its modifications are committed or none of them are.
Consistency
The data should be in a consistent state when the transaction completes. This means that
all related tables are updated.
Isolation
SQL Server supports concurrency, which means that data can be accessed or shared by many users.
A transaction works in isolation: other transactions do not see its intermediate changes and
cannot modify the same piece of data concurrently.
Durability
Once a transaction is committed, its changes are permanent and can be recovered even if the
system fails.
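Atomicity in particular can be demonstrated with a short runnable sketch. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server; the `account` table and the `transfer` helper are hypothetical. When the second statement of the transfer fails, the first one is rolled back as well.

```python
import sqlite3

# SQLite is used here only as a convenient stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()


def transfer(amount):
    """Move `amount` from account a to account b as one unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'b'", (amount,)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'a'", (amount,)
            )
        return True
    except sqlite3.IntegrityError:
        return False


failed = transfer(150)  # the debit would violate the CHECK constraint
after_failure = dict(conn.execute("SELECT name, balance FROM account"))

succeeded = transfer(60)
after_success = dict(conn.execute("SELECT name, balance FROM account"))

print(failed, after_failure)
print(succeeded, after_success)
```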
Read uncommitted
Read committed
Repeatable read
Serializable
Read uncommitted: This is the lowest isolation level and permits dirty reads. You can read
uncommitted data, which may be rolled back at any point. At this level, SQL Server does not take
shared locks while reading data.
Read committed: At this level, uncommitted data can't be read. This is the default isolation level,
and it uses shared locks while reading data.
b. Mixed Mode (Windows Authentication and SQL Server Authentication): users can connect with either
Windows credentials or a SQL Server login.
RAISERROR is used to produce a user-defined error or to raise an existing error message
stored in sys.messages. It is most commonly used in procedures when a condition is not
met.
Example:
@@ERROR holds the error number of the most recently executed T-SQL statement. SQL Server sets it
to 0 when the statement succeeds; if an error occurs, it is set to that error's number. Because
each statement resets the value, check or save it immediately after the statement of interest.
Example: check whether @@ERROR is 0 to confirm that the previous statement succeeded.
Databases
• A database organizes data into tables. It is made up of several tables with rows and columns.
Logical Component
Physical component
Log files
It contains log information which is used to recover database.
Every SQL Server instance has four primary system databases: master, model, tempdb, and
msdb. All other databases are user-created databases, set up according to user needs and requirements.
A single SQL Server instance is capable of handling thousands of users working on multiple
databases.
What is transact-SQL? Describe its types?
Types of Transact-SQL
SQL Server provides three types of Transact-SQL statements: DDL (data definition language), DCL
(data control language), and DML (data manipulation language).
Table
A SQL Server database stores information in two-dimensional objects of rows and columns called tables.
Data types
Data types specify the type of data that can be stored in a column.
Data types are used to apply data integrity to the column.
SQL Server supports many data types, such as char, varchar, int, binary, decimal, and money.
You can also create your own data type (a user-defined data type) based on a system data type.
Function
Index
An index can be thought of as the index of a book, used for fast retrieval of information.
An index uses one or more columns as index keys, together with pointers to the records, to locate records.
An index is used to speed up query performance.
The kinds of indexes are clustered and non-clustered. Both exist as B-tree structures.
A clustered index stores the rows in sorted order on disk.
A clustered index re-orders the table records.
A clustered index contains the records themselves in the leaf level of the B-tree.
There can be only one clustered index on a table.
In a non-clustered index, the logical order of the index doesn't match the physical order of the
data stored on disk.
The leaf level of a non-clustered index contains the index keys and pointers to the table records.
There can be one or more non-clustered indexes on a table.
A unique index is an index applied to a column whose values are unique.
A unique index can also be applied to a group of columns.
Constraint
Rule
A rule is an older alternative to a check constraint. You can apply only one rule to a column.
First create the rule using the CREATE RULE statement, and then bind the rule to the column using the
sp_bindrule system stored procedure.
Default
It supplies a default value for the column if you do not specify a value for it while inserting a
row.
Stored Procedures
Trigger
A trigger is a special type of event driven stored procedure.
It gets initiated when Insert, Delete or Update event occurs.
It can be used to maintain referential integrity.
A trigger can call stored procedure.
View
Replication
The process of copying/moving data between databases on the same or different servers. The types
of replication are:
Snapshot replication
Transactional replication
Merge replication
DBCC stands for Database Consistency Checker. DBCC commands are used to check the consistency of
databases.
DBCC CHECKDB - Ensures that tables and indexes are correctly linked in the database.
DBCC CHECKALLOC - Ensures that all pages are correctly allocated in the database.
DBCC SQLPERF - Reports the current usage of the transaction log as a percentage.
DBCC CHECKFILEGROUP - Checks all tables in a filegroup for any damage.
Define COLLATION.
Collation is the set of rules that SQL Server uses for sorting and comparing textual data. There are
three types of sort order: dictionary case-sensitive, dictionary case-insensitive, and binary.
UPDATE STATISTICS refreshes the query optimizer statistics kept for the indexes and columns of a
table. After a large number of deletions, modifications, or bulk copies into a table, the
statistics should be updated so that the optimizer takes these changes into account.
A candidate key is one that can identify each row of a table uniquely.
Generally a candidate key becomes the primary key of the table. If the
table has more than one candidate key, one of them will become the
primary key, and the rest are called alternate keys.
A key formed by combining two or more columns is called a
composite key.
What is a deadlock and what is a live lock? How will you go about
resolving deadlocks?
What are cursors? Explain different types of cursors. What are the
disadvantages of cursors? How can you avoid cursors?
Disadvantages of cursors: Each time you fetch a row from the cursor,
it results in a network roundtrip, whereas a normal SELECT query
makes only one roundtrip, however large the result set is. Cursors are
also costly because they require more resources and temporary storage
(which results in more I/O operations). Further, there are restrictions on
the SELECT statements that can be used with some types of cursors.
(3) Attaching databases
(4) Replication
(5) DTS
(6) BCP
(7) Log shipping
(8) INSERT...SELECT
(9) SELECT...INTO
An extent is a collection of eight physically contiguous pages (8 KB each, 64 KB in total). SQL
Server allocates space to tables and indexes in extents rather than page by page, which keeps an
object's pages next to one another and helps limit fragmentation. There are two types of extents:
uniform and mixed. A uniform extent is owned by a single object, which uses all eight of its
pages. A mixed extent is shared by more than one object (up to eight, one per page).
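The sizes involved follow from simple arithmetic, sketched below. The constants are the standard SQL Server page and extent sizes; `extents_needed` is a hypothetical helper that assumes an object stored only in uniform extents.

```python
PAGE_KB = 8            # SQL Server page size in KB
PAGES_PER_EXTENT = 8   # pages in one extent

extent_kb = PAGE_KB * PAGES_PER_EXTENT   # size of one extent in KB
extents_per_mb = 1024 // extent_kb       # extents that fit in one megabyte


def extents_needed(page_count):
    """Whole uniform extents required to hold `page_count` pages."""
    return -(-page_count // PAGES_PER_EXTENT)  # ceiling division


print(extent_kb, extents_per_mb, extents_needed(20))
```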
SSRS Questions
- Report Manager, Report Designer, browser types supported by Reporting Services, report
server, report server command-line utilities, report server database, Reporting Services
extensibility, and the data sources supported by Reporting Services.
- Report designing: The design is done in Visual Studio Report Designer. It generates a class
which embodies the report definition.
- Report processing: Processing combines the report definition with data from the
report data source. It performs all grouping, sorting, and filtering calculations. Expressions
are evaluated, except for those in the page header, footer, and section items. It then fires the
Binding event and the Bound event. The result of processing is a report instance, which
may be persisted and stored so that it can be rendered at a later point in time.
- Report rendering: Rendering starts by passing the report instance to a specific
rendering extension (for example, HTML or PDF). The report instance is paged if the output format
supports paging. The expressions in the page header and footer sections are evaluated
for every page. As a final step, the report is rendered to the specific output document.
Replication
Replication is a set of technologies for copying and distributing data and database
objects from one database to another and then synchronizing between databases to
maintain consistency. Using replication, you can distribute data to different locations
and to remote or mobile users over local and wide area networks, dial-up
connections, wireless connections, and the Internet.
Snapshot Replication
Snapshot replication simply takes a "snapshot" of the data on one server and moves that data to
another server (or another database on the same server). After the initial synchronization
snapshot, replication can refresh data in published tables periodically—based on the schedule
you specify. Although snapshot replication is the easiest type to set up and maintain, it requires
copying all data each time a table is refreshed.
Between scheduled refreshes, data on the publisher might be very different from the data on
subscriber. In short, snapshot replication isn't very different from emptying out the destination
table(s) and using a DTS package to import data from the source.
Transactional Replication
Transactional replication involves copying data from the publisher to the subscriber(s) once and
then delivering transactions to the subscriber(s) as they occur on the publisher. The initial copy of
the data is transported by using the same mechanism as with snapshot replication: SQL Server
takes a snapshot of data on the publisher and moves it to the subscriber(s). As database users
insert, update, or delete records on the publisher, transactions are forwarded to the subscriber(s).
To make sure that SQL Server synchronizes your transactions as quickly as possible, you can
make a simple configuration change: Tell it to deliver transactions continuously. Alternatively, you
can run synchronization tasks periodically. Transactional replication is most useful in
environments that have a dependable dedicated network line between database servers
participating in replication. Typically, database servers subscribing to transactional publications
do not modify data; they use data strictly for read-only purposes. However, SQL Server does
support transactional replication that allows data changes on subscribers as well.
Merge Replication
Merge replication combines data from multiple sources into a single central database. Much like
transactional replication, merge replication uses initial synchronization by taking the snapshot of
data on the publisher and moving it to subscribers. Unlike transactional replication, merge
replication allows changes of the same data on publishers and subscribers, even when
subscribers are not connected to the network. When subscribers connect to the network,
replication will detect and combine changes from all subscribers and change data on the
publisher accordingly. Merge replication is useful when you have a need to modify data on
remote computers and when subscribers are not guaranteed to have a continuous connection to
the network.
The Full Recovery Model is the most resistant to data loss of all the recovery models.
The Full Recovery Model makes full use of the transaction log – all database operations
are written to the transaction log. This includes all DML statements, but also whenever
BCP or bulk insert is used.
For heavy OLTP databases, there is overhead associated with logging all of the
transactions, and the transaction log must be continually backed up to prevent it from
getting too large.
Benefits:
Disadvantages:
The Bulk-Logged Recovery Model differs from the Full Recovery Model in that rows
inserted during bulk operations are only minimally logged, yet a full restore is still possible
because the extents that have been changed are tracked. The following operations are
minimally logged:
• SELECT INTO
• bcp and BULK INSERT
• CREATE INDEX
• Text and Image operations
Benefits:
Disadvantages:
The simple recovery model is the most open to data loss. The transaction log can't be
backed up and is automatically truncated at checkpoints. This potential loss of data
makes the simple recovery model a poor choice for production databases. This option can
take up less disk space since the transaction log is constantly truncated.
Benefits:
Disadvantages:
Subscription Overview
A subscription is a standing request to deliver a report at a specific time or in
response to an event, and then to have that report presented in a way that you
define.
Subscriptions can be used to schedule and then automate the delivery of a report.
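-- Last day of the previous month, formatted as yyyy.mm.dd (style 102)
SELECT CONVERT(nvarchar(100), DATEADD(dd, -DATEPART(dd, GETDATE()), GETDATE()), 102)
-- The same date returned as a datetime value (the time of day is kept)
SELECT DATEADD(dd, -DATEPART(dd, GETDATE()), GETDATE())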
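The date arithmetic in the two queries above works by subtracting the day-of-month from the current date, which always lands on the last day of the previous month. A minimal Python sketch of the same calculation follows; `last_day_of_previous_month` is a hypothetical helper, and a fixed sample date replaces GETDATE() so that the result is reproducible.

```python
from datetime import date, timedelta


def last_day_of_previous_month(today):
    """Subtract the day-of-month from the date, mirroring
    DATEADD(dd, -DATEPART(dd, d), d) in T-SQL."""
    return today - timedelta(days=today.day)


sample = date(2024, 3, 15)  # fixed sample date in place of GETDATE()
result = last_day_of_previous_month(sample)

# CONVERT style 102 formats a date as yyyy.mm.dd
formatted = result.strftime("%Y.%m.%d")

print(result, formatted)
```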