Databse Miscellaneous Topics
by JustEtc Publications Ltd
Distributed by justEtc |
Firebird as an Alternative to Oracle/MS SQL Server/Interbase/Access
Note:
Firebird can be a very good alternatives to Oracle/MSSql Server/Interbase. First of all, it is free. It can be good enough for upto medium level of enterprises. Supported database size is more than 11 terabytes. For a single table 20GB. It also implements ACID properties well enough. Implements two phase commit and in record level locking and hence can provide more concurrency and connection.
It also implements MGA to provide recovery that also Oracle/MSSQL server imitated from interbase/firebird.
A well designed database and a well designed network using Firebird will be strong enough to support a mid - size company. Also, if the database is designed well and used the clean normalization, and used clean indexing, the query will be faster enough.
Linux platform is more suitable than windows platform for Firebird.
In 32 bit environment, Firebird server version support 2 GB of ram and hence around 450 concurrent connection. This is due to memory addressing in 32 environment, required cache size and for similar reasons.
In Linux 64 bit versions Firebird can provide better service and more concurrent connections. Firebird is not stable and reliable enough to be used with windows platform in 64 bit environment. Though, Firebird project, hopes to work on this.
Firebird, has a dedicated community to support the product. They provide supports through a discussion forum. (May be .. with some other ways). Firebird team also hopes to continue the project to fullfill future demand.
Being opensource project the database system seems will be strong enough and well designed and well featured to meet customer demand.
For details please check:
A detail discussion can be found here.
http://www.ibphoenix.com/main.nfs?a=ibphoenix&page=ibp_enterprise_firebird
Note:
Firebird can be a good option for a Midsize company in Bangladesh
Features of Firebird
What is firebird:
The software has two main components: the database server, which lives on the same host machine as the databases, and the
application interface, commonly referred to as “the client library”. The client runs in the client pc or in the middleware
for accessing through internet browsers.
The server's executable is less than 1.5 Mb and a full server installation, including all tools and documentation, takes
up less than 10 Mb.
Versions:
1.5 most stable and tested
firebird 2: Mid of year 2006
Vulcan: A parallel project used by SAS to move many of its business applications from Oracle to Firebird.
Purpose: redesigning the threading architecture of the database engine
Is Firebird “Enterprise Capable”?
stability, scalability, availability, capacity, interoperability and autonomy.
Stability
Stable enough, supports strongly ACID properties
Scalable??
Good enough but not extremely scalable
Firebird's upscaling is merely a question of adapting the environment. The same engine comfortably handles anything from
being embedded in a stand-alone client application, through to a classical two-tier client/server LAN of around 750
potential users, to incorporation in a multi-tier solution for thousands of potential clients. Database growth is
effectively limited only by the disk storage available and can be split across multiple hard disks.
Through smart replication and good connection management in the access layers, the workload of a busy system can be
distributed across multiple servers. For example, a well-resourced central server can handle the interactive demands of
LAN, intranet or extranet (or all together) while a replicated server takes care of long-running jobs that need to isolate
a snapshot of data for lengthy periods.
Availability
Among its users, Firebird has a reputation for being bomb-proof.
It uses optimistic locking at record level, drastically reducing the wait-time overheads in comparison to others where
read-write transactions lock entire sets, even tables, pre-emptively. No tuning is ever required to facilitate handling
varying workloads through the day or week. A database does not have to be shut down for backups. It can be replicated or
shadowed for almost instantaneous cutover in the event of disk failure. It is robust and recovers immediately from power
failure, without loss of database integrity.
Supports online backup.
Firebird is a popular choice for enterprises needing continuity of service around the clock. Command-line tools are
distributed with the software for all administrative activities, allowing regular housekeeping to be automated as
scheduled or on-demand jobs. A Services API is also available to wrap admin tasks into a program or service application.
Firebird Databases as the Back-end to Enterprise Software Systems
Who Uses Firebird?
Because Firebird is free, there are no licences to count, no beans to count. It is known, from reputable enterprise surveys, that Firebird is chugging away on hundreds of thousands of production sites around the world. The following is a selection of companies and organisations that are publicly known to be using Firebird:
Broadview Software Ltd, Toronto, Canada, vendor of information and control systems and online services for broadcasters worldwide
Morfik P/L, Hobart, Tas., developers and vendors of WebOS development suite for construction and maintenance of interactive websites, stores web objects in a Firebird meta-layer (system database) as well as Firebird user data.
Communicare Systems Pty Ltd, Perth, WA, vendor of patient management and medical records software for hospitals, clinics, medical practices and mobile health units across Australia.
“The Examiner” newspaper, Launceston, Tas., high availability(24/7) business, information, production and news systems.
U.S. Navy, broad range of management and logistical systems
Frontrange Solutions USA Inc., Colorado Springs, U.S.A, as the back-end of the highly scalable, award-winning integrated CRM, service management and business systems “Goldmine” software suite.
British Rail, U.K., timetabling, bookings, accounting and information systems for national railway passenger network.
Deutsche Presse-Agentur GmbH, HQ in Hamburg, Germany, largest press agency in Germany, provides a worldwide service to newspapers, magazines, TV and radio news networks.
KIMData, Munich, Germany, business intelligence systems and data warehousing for German hospitals.
Comparison: FireBird, MySQL, PostGreSQL
PosTGRESQL Provides better query responsetime than firebird and MySql. Firebird stands somewhere in the middle. A
little longer response time than PostGreSql.
http://benchw.sourceforge.net/benchw_results_open3.html
Several PostGreSQL versions are compared at:
http://benchw.sourceforge.net/benchw_results_postgres_history.html
Several MySQL versions are compared at:
http://benchw.sourceforge.net/benchw_results_mysql_indexes.html
A good comparison among these databases will be found at:
http://www.geocities.com/mailsoftware42/db/
Mentionable comments
PostGreSQL Supports 400+GB DB.
One source: As for each connection it forks new thread, it is bit slower than MySql: Another source: postgresql is faster
than mysql. Seems the truth is: for single connection postgresql can be faster but for multiple connections mysql is
faster. just an information, not sure yet.
New PostGre Features
http://www.postgresql.org/docs/whatsnew
Some mentionable comments: Supports SMP and almost linearly improves performance witth multiprocessor system.
Supports roles, in/out parameters/two phase commit/table partitioning shared row locking(firebird record level
locking), 64 bit shared memory hence supports two terabytes of ram
PostGreSQL capability
4.4) What is the maximum size for a row, a table, and a database?
These are the limits:
Maximum size for a database? unlimited (32 TB databases exist)
Maximum size for a table? 32 TB
Maximum size for a row? 400 GB
Maximum size for a field? 1 GB
Maximum number of rows in a table? unlimited
Maximum number of columns in a table? 250-1600 depending on column types
Maximum number of indexes on a table? unlimited
One limitation is that indexes can not be created on columns longer than about 2,000 characters. Fortunately, such indexes
are rarely needed. Uniqueness is best guaranteed by a function index of an MD5 hash of the long column, and full text
indexing allows for searching of words within the column.
Source:
http://www.postgresql.org/docs/faqs.FAQ.html
4.20) What replication solutions are available?
Though "replication" is a single term, there are several technologies for doing replication, with advantages and
disadvantages for each.
Master/slave replication allows a single master to receive read/write queries, while slaves can only accept read/SELECT
queries. The most popular freely available master-slave PostgreSQL replication solution is Slony-I.
Multi-master replication allows read/write queries to be sent to multiple replicated computers. This capability also has a
severe impact on performance due to the need to synchronize changes between servers. PGCluster is the most popular such
solution freely available for PostgreSQL.
There are also commercial and hardware-based replication solutions available supporting a variety of replication
models.
Important Notes
Server Configuration
A number of postgresql.conf settings affect performance. For more details, see Administration Guide/Server Run-time
Environment/Run-time Configuration for a full listing, and for commentary see
http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html and
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html.
Hardware Selection
The effect of hardware on performance is detailed in http://www.powerpostgresql.com/PerfList/ and
http://momjian.us/main/writings/pgsql/hw_performance/index.html.
EnterpriseDB kind of commercial but cheap based on postgressql
PostGreSQL Limitations and solutions
http://searchopensource.techtarget.com/originalContent/0,289142,sid39_gci1172668,00.html
PostGreSQL Windows Installer
http://pgfoundry.org/projects/pginstaller/
Important Links for Mysql, PostGreSQL, Firebird, EnterpriseDB, Ingres
Firebird and postgresql:
http://www.linuxjournal.com/node/7010
Important comment: easy to migrate from oracle to firebird than to postgre or mysql. From interbase to firebird will be better as they are similar.
http://forums.devshed.com/firebird-sql-development-61/mysql--vs--firebird-sql-62269.html : Firebird is far better choice than mysql
As for Performance, MySQL will outperform Firebird on almost all tests on local machine......
On a Network Server a properly designed Firebird Server will outperform MySQL on any Test and the network-Traffic generated by Mysql will be many times higher than Firebird
So you decide the one you need according to your needs .....
This link also compares fb with potgre and favors fb as the author's background is borland
Why Postgress is better than firebird
Firebird is not as good as PostgresQL (Score:0)
By Anonymous Reader on 2004.12.03 2:21 (#102703)
And here is why:
1. Firebird has no temp table support.
2. Firebird uses several SQL modes i.e DSQL which only works client side and PSQL which only works in procs. All sql works in Postgres functions.
Try creating a table or a user in a Firebird proc...ooops can't be done.
3. Firebird databases grow out of control and have to be backed up restored on a regular basis.
4. Firebird has a concept of a OAT (oldest active transaction counter) if this counter gets stuck all transactions get held up and your DB size goes through the roof and performance goes to the basement.
5. Firebird has virtually no built in functions and flakey UDFs must be used. Postgres on the other hand has every function you can think of.
6. Postgres has many rich procedural languages including perl,TCL,Java and C# in the works.
I could go on.... Version 8 of Postgres which will be released soon adds point in time recovery, Try except error handling in functions, table spaces and more.
I converted all my apps from Firebird 1.5 to Postgresql 7.4 and I would never use Firebird again on Unix or Windows
Why PostGress better than FireBird: This comparison may not be accurate though.
Postgres Vs Interbase/Firebird (Score:0)
By Anonymous Reader on 2004.12.03 18:23 (#102740)
http://www.vitavoom.com/postgresql.html
Firebird has not true text type you must use the clumsy blobs and weird blob subtypes.
Postgres is much closer for enterprise use than Firebird.
almost all of this applies to Firebird as well as interbase.
PostgreSQL vs Interbase
PostgreSQL...
* ... has no "dialects" or other weird features.
* ... has no "backdoors" which can compromise your whole company's security and business (check here and here).
* ... supports unlimited row sizes, unlimited datasebase sizes, tables up to 16TB (!!), unlimited number of rows, unlimited indexes for table and 1Gb per field.
* ... has a boolean field.
* ... can cancel a query asynchronously (Interbase 6.5 - the commercial version - seems to support it now).
* ... has a flexible (not fixed) types system, and supports more types then Interbase. You can always add your own types at your wish, or change existent types behavior. PostgreSQL even has geometrical and IPv4 types support (!!).
* ... supports inheritance.
* .. supports flexible full text indexing through OpenFTS.
* ... has a much more sofisticated locking mechanism (MVCC).
* ... has arbitrary precision numeric fileds (numeric type).
* ... is resistant to crashes and power failures (by using it's logging system, MVCC and chepoint)). Although it is technically possible that a database gets corrupt, we at Vita voom never saw it (except for corrupt media storage of course).
* ... supports functions (whose can be used as stored procedures). These functions current can be written on SQL, pl/PgSQL (a language similar to Oracle's pl/SQL), TCL, Perl, and Python, C, C++ (or other compiled languages) but it's not limited to them.
* ... can have functions to define default values for columns (providing ultimate default value flexibility).
* ... supports much more arrays types then Interbase, they are more flexible and can support much more elements.
* ... supports rules.
* ... supports "fetch" SQL command to get only a limited number of rows at a time, making queries more responsive and resource economic.
* ... has regular expressions support (for searches and operations).
* ... has a 'EXPLAIN' command which will show will how it will perform a query, so that you optimize it.
* ... has statistics about database usage which can be used to optimize queries and indices.
* ... has sequences.
* ... has more built-in functions then Interbase (or any other open source RDBMS).
* ... supports indexes on functions.
* ... has broader subselects support then Interbase.
* ... has a more flexible BLOB fields support.
* ... can limit the number of rows retrieved at at time (with the 'LIMIT' keyword).
* ... is more standards-compliant then Interbase.
* ... allo
Speed: Firebird vs. PostGreSql
Please read.
According to this article postgresql faster than firebird. But I also got information/tests where firebird is faster than postgresql.
http://archives.postgresql.org/pgsql-advocacy/2003-06/msg00280.php
However, Speed may not be the only factor to select a database product. We can improve raw speed in many alternate ways, hardware solutions, ram, faster storage, faster CPU. To compare speed issue, efficiency issue, we need to take a look how the query optimizations are implemented in databases. How, the joins, inner joins, outer joins are implemented. How multiple joins are implemented. Also, indexing is a factor that also affects performance issue. We can try creating same database and using a queries that are simple to complex and run on both databases and see, how well they perform.
Postgresql supports SMP (multiple processor system) and provides better efficiency in multiprocessor systems. Firebird, is not still matured in SMP environment.
Firebird supports 32 bit environment, still not stable/reliable enough for 64 bit env. It supports max ram of 2 gb and hence 150 - 450 concurrent users. Postgresql supports 2 TB of ram in a single server. For both of them, we can implement multiple servers with master/slave relation for better performances.
Tools to create ER diagrams from a database dynamically
Microsoft Visio, and dbVisualizer provide features to create ER diagrams from existing database tables.
Why needed?
ER diagram is very useful to understand the total
database structure that helps a new programmer to work on the existing databases more efficiently.
Visio:
While creating Database diagrams, a menu item named database is displayed. Reverse Engineering option under Database menu helps to create ER diagram from
existing database tables. Before that, target database drivers for visio need to be installed. If the database is in db2/mysql/postgresql, then
db2/mysql/postgresql drivers for visio should be installed. From Database/options/drivers, the driver needs to be configured. Afterwards, Reverse Engineering
option may be used to create the ER diagram. There is no suitable driver for postgresql databses for Visio. ODBC drivers may be used for the purpose (the ER
diagram may not be accurate from ODBC driver for postgresql).
dbVisualizer:
In dbVisualizer, when the TABLEs group/option is selected in the left pane in the right/middle pane there will be a references option. When the references
tab is selected, an ER diagram is displayed using all tables. The ER diagram can be displayed as Hierarchical, circular, orthogonal, or organic. There are
options for the purpose.
When a single table name is selected, the table and the associated tables are displyed in an ER diagram.
In the rightest column, selected tables option allows to create ER diagrams only for the selected tables. When 'selected tables' is clicked a list is
displayed in a dialogbox. Tables can be selected from this list (use control key). Build graph option creates and displays the ER diagram.
Normalization (1NF to 5th NF)
Please check this link.
http://www.keithjbrown.co.uk/vworks/mysql/mysql_p7.php
DBMS System Online Course
Please check the resources at:
http://www.cs.sfu.ca/CC/354/zaiane/material/notes/contents.html
Just like an undergraduate course in Database Management Systems. Along with other topic, ER-Diagram and representation of ER-Diagrams into tables are provided. Some of the complex forms of ER diagrams are discussed.
Practical Database Design Resources
Normalization in Relational DBMS Systems
First Normal Form:
All table columns should have distinct meaning
In another word: All attributes of an entity/table must be unique
Solution:
Grow the table row wise[in the direction of rows] not the column wise
Group the columns with same meaning into one
To handle multiple values for the same attribute of an entity create multiple rows [not multiple columns]
To keep track of paycheques, you may think the table should be as follows:
Name, provider company, date 1, pay cheque 1, date2 , pay cheque 2, date 3, paycheque 3
See date1, date2, and date 3 have the same meaning
Pay cheque 1, Pay cheque 2, and Pay cheque 3 have the same meaning
These are the repeating attributes
Solution:
create table with:
number name, company, date, paycheque
100 1, xyz, 12/13, 1000.00
100 1, abc, 12/13, 1000.00
100 1, xyz, 12/31, 1000.00
or
create a child entity with Date and amount. And a master entity with number, Name and Company
Entity one: number, name, company
Entity two: number, date, amount
2nd Normal form:
----------------
Rule:
All attributes can be identified from the primary key. Primary key is directly related to all other attributes.
All attributes are fully dependent on the primary key.
Line Item Table/Entity:
--------------------
number [line item] [pk]
order number[pk]
vendor name
vendor town
product code
product amount
Here, vendor name and vendor town may not be fully dependent on the whole primary key. They are dependent on the order number.
So we can decompose it into another table.
Order entity
Number [pk]
Vendor Name
Vendor Town
Lineitem entity
number [pk]
order_number [pk]
product code
product amount
3rd normal form
-----------------
Rule: Non key attributes are fully dependent on the primary key but not on any other key or attribute.
In Order entity table, vendor town is not fully dependent on number[pk], but it does depend on the vendor name. So we can decompose order entity table into two
Vendor Entity
Name [pk]
Town
Order Entity
Number
vendor-name
Boyce/Codd and Fourth Normal Form
Boyce/Codd Normal Form
----------------------
It is just an extension to the third normal form. Third normal form ensures that non key attributes does not depend on any non key attribute but fully depend on the key[primary] attributes.
Boyce codd normal form ensures that non key attributes are fully dependent on the total set of the primary key. Non-key attributes should not
depend [only] on a subset of the key[primary] attributes.
Fourth Normal Form:
-------------------
It tries to decompose an entity/table into multiple entities/tables when there are multiple independent[not dependent on each other] multivalued attributes in an entity/table.
For example: Consider a table with emp_number, skill name, objective of the employee. Employees may have many skills and also many objectives. Hence, skill_name, and objective are multivalued and also they do not depend on each other. This table will cause many redundant data.Better is to use two tables like:
Emp_skill
Emp_num, Skill Name
Emp_objective
Emp_num, Objective
Logical Data Modeling: Logical Database Design Steps: RDBMS
Logical Data Modeling
- Identify major entities
- Determine relationships between entities
- Determine primary and alternate keys
- Determine foreign keys
- Determine key business rules
- Add remaining attributes
- Validate user views through normalization
- Determine domains
- Determine triggering operations
- Combine user views
- Integrate with existing data models
- Analyze for stability and growth
Translate Logical Model into the Real Database System
- Identify tables
- Identify columns
- Adapt data structure to product environment
- Design for business rules about entities
- Design for business rules about relationships
- Design for additional business rules about
attributes
- Tune for scan efficiency
- Define clustering sequences
- Define hash keys
- Add indexes
- Add duplicate data
- Redefine columns
- Redefine tables
Design for Special Design Challenges
- Provide for access through views
- Establish security
- Cope with very large databases
- Access and accommodate change
- Anticipate relational technology evolution
Reference: C. C. Fleming and B. V. Halle
Query Optimization in PostGreSQL
check: http://www.postgresql.org/docs/7.3/interactive/geqo.html
"optimization effort is caused by the support of a variety of join methods (e.g., nested loop, hash join, merge join in PostgreSQL) to process individual joins and a diversity of indexes (e.g., R-tree, B-tree, hash in PostgreSQL) as access paths for relations. "
"The current PostgreSQL optimizer implementation performs a near-exhaustive search over the space of alternative strategies. This query optimization technique is inadequate to support database application domains that involve the need for extensive queries, such as artificial intelligence. "
Query Optimization in Firebird
Please check: http://www.ibphoenix.com/main.nfs?a=ibphoenix&s=1154575297:5422&page=ibp_50_query
What is Data Warehousing?
Data Warehouse is a storage of an organization's historical data. Usually, data warehouse is built to provide a decision support system to the management. Data warehouse can be mined (using data mining) and queried to collect overall and summarized business information that significantly helps in decision making. It can provide trends over many years.
Operational databases are optimized for day to day operation access. Here, the relationships are normalized and further divided into more relations/tables to make accessing faster that also reduces data traffic among the systems. But data warehouse does the reverse, where relationships/tables are de-normalized as much as possible to provide the required information to the management. As datawarehouse provides summarized information de-normalization helps, otherwise the data would be required to collect from many tables and the queries may be running over data for 10 years (millions/billions of records). Usually, data warehouse is created separately from the operational databases.
Microsoft SQL Server and Oracle both have supports and tools to create Data warehouse Systems. There are also free tools for the purpose.
Random Information on BI
- Real internet money/business lies in B2B not in
B2C.
- Decision Support System (DSS): Full spectrum of
systems that allow/help management to take
decisions, such as, reporting, OLAP, data mining.
- Java provides support for web enabled DSS
- JDM is Java's Data mining/BI API.
- BI Applications: Balanced Score-card, Activity Based
Costing.
- Data warehouse elements
- Operational Environment: Maintains day to day
activities. Raw Data
- Independent data mart: A small data ware house
specific to one topic/area/subject
- Extraction: Extraction of data from operational
systems
- Extraction Store: Data storage after extraction
- Transformation and cleansing: Before putting
data into the central/main warehouse data needs
to be cleaned up
- Extraction log: Create a log in the
transformation and cleansing process
- External source: Data originated outside the
organization is also incorporated into the data
warehouse.
- Data administrator: The person who ensures the
quality of data
- Central Repository: Stores all data and meta
data
- Meta data: Data about data. Example: the
extraction log.
- Dependent Data Mart: Relies on the data
warehouse for the sources of its data
- Distributed Data Warehouse: Warehouse data is
physically distributed across a company's
information infrastructure.
- Data mining types: classification, and estimation.
- Classification: categorize customers, Estimation: predict or estimate some numerical value
- Decision trees: modeling technique helps in
deciding. Different alternatives are drawn as paths
where nodes represent the selected alternative.
edges/paths may carry some probability values as
selection/influencing criteria.
- Neural networks: Try to mimic the work nature of
human brain to take a decision.
- General modeling: Well-suited for categorizing.
- Balanced Scorecard application: views the health and
well-being of the organization by more than its
financial data. We need to define: Vision, Strategy,
Strategic themes, Strategic Objectives, Key
performance indicators.
- Activity based costing application:
- Three types of organizations: Engineer driven: They know how to make a product. Than sales it through sales force. Sales Driven: They know how to sale a particular product, then make the product and sell it. Customer driven: They Know what customers want, then make the products and sale them. Customer driven organizations are the most successful.
How to implement BI/Warehouse
Steps in Data Warehouse Design and Implementation
- Most BI projects usually fail. It is not due to the errors in construction steps but due to the inappropriate design steps and methodologies.
- Waterfall method is not appropriate for BI. A method like developmental spiral may be more appropriate. Ref: [W. A. Giovinazzo ]
- Steps in developmental spiral: Definition, Analysis, Design, Development, Implementation/Deployment, Maintenance, Evaluation
- Developmental spiral method is effective for BI only when object oriented approach is taken
- In developmental spiral, object modeling is used to represent objects in the related business areas. In object modeling, objects are represented as rectangles, the rectangles can be treated as classes. Classes/objects can have sub-classes/sub-objects. Objects have attributes. Objects also have some special properties like: accelerators, stops.
- Definition - first step of developmental spiral: Define a short description what you want to build - like the problem statement in a research proposal. Example:
I want to build a data mart that will provide data to support the analysis of customer demographics such as age, gender, and income for all product lines
- Analysis Phase: In this step, analyze the requirements of the BI system, the nature of the BI system, the expectations from the BI system. Interviewing different parties will help a lot:
- Interview the sponsor to get a detail understanding of the mission. The questions will include: Data-related(which objects are related to the business needs?), user-related: who are the users, system evaluation: what parameters will make the project a success?
- Interviewing management: To understand management's expectation of the data warehouse. Questions may include: User profile related (what types of systems you use/will use to access BI?), System vision: what do you see as the objective of the data ware house, System Evaluation: what parameters will make the project a success?
- Interviewing Users : Interviewing users who will use the system
- Interviewing information technology staffs: To know the current state of the organization's information systems
- Design phase: output - object model, implementation model
- Start with the analysis phase information. The design phase is complete when all the objects related to the mission are well defined. An object diagram needs to be drawn showing all objects, relationships, attributes.
- Another chart listing all the attributes and their data types will be useful. Objects will have relations like super classes or sub-classes.
- Cardinality relationships: exactly one, one to many, zero or one, zero to many
- aggregation may be required in some cases. Aggregation: an object is composed of other objects.
- Implementation Model: Data warehouse databases are multi dimensional databases. Objects are represented in more than two dimensions. Common practice is: three dimensional databases. Such as an object/table represented with three dimensions (called cube) like: product, dealer, and time. RDBMS's are two dimensional.
- Data cubes: have six possible different views [permutation of three dimensions].
- Why not always multidimensional databases: because of space requirements.
- Star schema: Provides a multidimensional flavor in two dimensional relational databases.
- Star schema uses a concept called fact tables to bind dimensions to create a multidimensional space.
- Denormalization of the tables are utilized in star schema to create multi dimension.
- Dimension table: Think about a three dimensional cube. Each wall represents a dimension table. Or think it as a mathematical combination of the dimensions.
- All objects represented in the dimensions can slowly change over time - slowly changing dimensions. There are many approaches to address this issue.
- Snowflakes: normalizing dimensions - not always a great thing
- Implementation considerations: Parallel processing, Bitmapped Indexing, Star Query Optimization, Summation Tables
-
MySql Administration SQL Commands
Using MySQL, Administration
Workshop Requirements
You should have access to the MySQL command line client software.
Various different PRIVILEGES on the MySQL Server
Introduction
In the other MySQL Virtual Workshops we have used commands that are pretty much applicable to anyone. This part of the MySQL series is aimed at giving a rudimentary understanding of managing a MySQL database server. As such the task covered here are not really about manipulating data or database structures, but the actual databases themselves.
Creating a Database
In order to create a database you need to have the PRIVILEGES- this may be because you are the root user or you (or you systems administrator) has created an admin user that has ALL PRIVILEGES over all databases. In these examples a user called 'admin' has been created precisely for this purpose. Creating a database is fairly straightforward.
Logging In
A reminder of how to start the MySQL Client Software, and as we are not concerned with manipulating just one database we don't have to specify a database as part of our startup command.
$ mysql -u -p
Enter password:Create database command
Next we are ready to enter the very simple command to create a database which is:
mysql> CREATE DATABASE ; Let's imagine that we are going to create a 'vworks' database (those wishing to create a database for use with the VWs should use this). We would enter the command:
mysql> CREATE DATABASE vworks; We can now check for the presence of this database by typing:
mysql> SHOW DATABASES;
+-----------+
| Database |
+-----------+
| mysql |
| vworks |
+-----------+
2 rows in set (0.06 sec)The other database listed ('mysql') is the internal database which MySQL uses to manage users, permissions etc.
NOTE: Deleting or DROPing a database is similar to the DROP TABLE command issued in Part 4. e.g.
DROP DATABASE Granting Privileges on the new database
Now that we have created a database, we need to decide who gets to use it. This is done by granting permissions for a user to use the database. This has a simplified syntax of:
GRANT
ON
TO
[IDENTIFIED BY ]
[WITH GRANT OPTION]Where the items in square brackets are optional. The most common use is to give ALL PRIVILEGES on a database to a local user who has to use a password to access the database (in this case vworks).
mysql> GRANT ALL PRIVILEGES
-> ON vworks.*
-> TO newuser@localhost
-> IDENTIFIED BY 'newpassword'; If you are creating a database for use with the rest of the Virtual Workshops you should use this statement, substituting your username and password of choice. There are some other options we will look at. To restrict the user to manipulating data (rather than table or database structures) the statement would be altered to:
mysql> GRANT SELECT,INSERT,UPDATE,DELETE
-> ON vworks.*
-> TO newuser@localhost
-> IDENTIFIED BY 'newpassword'; So that the user can only change the data using SELECT,INSERT,UPDATE or DELETE statements. If you wished to give a non-local user permissions on the database (for use with remote clients) then you could designate an IP or host address from which the user can connect:
mysql> GRANT ALL PRIVILEGES
-> ON vworks.*
-> TO newuser@192.168.0.2
-> IDENTIFIED BY 'newpassword'; Now a user on the machine '192.168.0.2' can connect to the database. To allow a user to connect from anywhere you would use a wildcard '%'
mysql> GRANT ALL PRIVILEGES
-> ON vworks.*
-> TO newuser@'%'
-> IDENTIFIED BY 'newpassword';You could even decide that a user doesn't need a password if connecting from a certain machine.
mysql> GRANT ALL PRIVILEGES
-> ON vworks.*
-> TO newuser@192.168.0.2But I think it is sometimes good to provide a password anyway. Finally we'll look at the WITH GRANT OPTION condition. This allows the user to give others privileges to that database:
mysql> GRANT ALL PRIVILEGES
-> ON vworks.*
-> TO newuser@localhost
-> IDENTIFIED BY 'newpassword'
-> WITH GRANT OPTION; This would allow the user 'newuser' to log into the database and give their friend privileges to SELECT,INSERT,UPDATE or DELETE from the database.
mysql> GRANT SELECT,INSERT,UPDATE,DELETE
-> ON vworks.*
-> TO friend@localhost
-> IDENTIFIED BY 'friendpass'; The WITH GRANT OPTION usually signifies ownership although it is worth noting that no user can GRANT more privileges that they themselves possess.
Revoking privileges
Revoking privileges is almost identical to granting them as you simply substitute RE VOKE.... FROM for GRANT....TO and omit any passwords or other options.
For example to REVOKE the privileges assigned to a user called 'badvworks':
mysql> REVOKE ALL PRIVILEGES
-> ON vworks.*
-> FROM badvworks@localhost; Or just to remove UPDATE, INSERT and DELETE privileges to that data cannot be changed.
mysql> REVOKE INSERT,UPDATE,DELETE
-> ON vworks.*
-> FROM badvworks@localhost; Backing Up Data
There are several methods we can use to backup data. We are going to look at a couple of utilities that come with MySQL: mysqlhotcopy and mysqldump.
mysqlhotcopy
mysqlhotcopy is a command line utility written in Perl that backs up (to a location you specify) the files which make up a database. You could do this manually, but mysqlhotcopy has the advantage of combining several different commands that lock the tables etc to prevent data corruption. The syntax (as ever) first.
$ mysqlhotcopy -u -p /backup/location/Which SHOULD copy all the tables (*.frm, *.MYI, *.MYD) into the new directory - the script does require the DBI perl module though. To restore these backup files simply copy them back into your MySQL data directory.
mysqldump
This is my preferred method of backing up. This outputs the table structure and data in series of SQL commands stored in a text file. The simplified syntax is
$ mysqldump -u -p [] > file.sqlSo for example to back up a 'vworks' database which may have been created by completing the workshops:
$ mysqldump -u admin -p vworks > vworks.sql
After entering the password a 'vworks.sql' file should be created. When you look at this file you can actually see that the data and structures are stored as a series of SQL statements. e.g.:
-- MySQL dump 8.22
--
-- Host: localhost Database: vworks
---------------------------------------------------------
-- Server version 3.23.52
--
-- Table structure for table 'artist'
--
CREATE TABLE artist (
artistID int(3) NOT NULL auto _increment,
name varchar(20) default NULL,
PRIMARY KEY (artistID)
) TYPE=MyISAM;
--
-- Dumping data for table 'artist'
--
INSERT INTO artist VALUES (1,'Jamiroquai');
INSERT INTO artist VALUES (2,'Various');
INSERT INTO artist VALUES (3,'westlife');
INSERT INTO artist VALUES (4,'Various');
INSERT INTO artist VALUES (5,'Abba');
And so on for the other tables.
We could also have chosen to output just one table from the database, for example the artist table:
$ mysqldump -u admin -p vworks artist > artist.sqlWe could even dump all the databases out (providing we have the permissions).
$ mysqldump -u admin -p --all-databases > alldb.sqlRestoring a Dump
Restoring a dump depends on what you have actually dumped. For example to restore a database to a blank database (perhaps having transferred the sql file to another machine) it is fairly simple.
$ mysql -u admin -p vworks < vworks.sql...or to add a non-existent table to a database...
$ mysql -u admin -p vworks < artist.sqlHowever, what happens if we want to restore data to an existing database (perhaps a nightly backup) ? Well we would have to add other options:
The equivalent of overwriting the existing tables would be telling the dump to automatically drop any tables that exist before restoring the stored tables. This is done with the ' --add-drop-table ' option added to our statement.
$ mysqldump -u admin -p --add-drop-table vworks > vworks.sqlThen restore like normal:
$ mysql -u admin -p vworks < vworks.sqlThe reverse might also be true. We may wish to create the database if it doesn't already exist. To do this we use the '--databases' option to specify the database we wish to back up (you can specify more than one).
$ mysqldump -u admin -p --databases vworks > vworksDB.sqlThis will create additional SQL statements at the start of each database that CREATEs the dumped database (checking first to see if it does indeed exist) then USEing that database to import the table data into.
-- Current Database: vworks
--
CREATE DATABASE /*!32312 IF NOT EXISTS*/ vworks;
USE vworks;
Again we can resort like normal, but of course this time we can omit the database name.
$ mysql -u admin -p < vworksDB.sqlOptimising a dump
There are a couple of options that are sometimes worth including when backing up and restoring large databases.
The first option is '--opt', this is used override the mysql server's normal method of reading the whole result set into memory giving a faster dump. Example:
$ mysqldump -u admin -p --opt vworks > vworks.sqlThe second option is '-a' or '-all' (either will do). Which also optimises the dump by creating mysql specific CREATE statements that speeds up the restore:
$ mysqldump -u admin -p --all vworks > vworks.sqlUsing mysqldump to copy databases.
It is possible to combine a dump and a restore on one line by using a pipe '|' to pass the output of the dump directly to mysql basically bypassing the file. This may initially seem a bit redundant, but we can use this method to copy a database to another server or even create a duplicate copy.
For example to copy the 'vworks' database to a mysql server called 'remote.server.com':
$ mysqldump -u admin -p --databases vworks | \
> mysql -u backup -p MyPassword -h remote.server.com Note: the"\" at the end of the first line means you wish to contine the command on another line before executing it.
You may, in certain circumstances, wish to make a copy of live data so that you can test new scripts and 'real world' data. To do this you would need to duplicate a local database. First create the duplicate database:
mysql> CREATE DATABASE vworks2; Then once appropriate privileges have been assigned we can copy the tables from the first table into the second.
$ mysqldump -u admin -p vworks | mysql -u backup -p MyPassword vworks2Notice in both these examples the second half of the line (after the pipe) passes the password as part of the connection statement. This is because asking for two separate passwords at the same time breaks most shells. That is why I have used a 'backup' user who can be granted permissions and have them revoked as necessary.
Miscellaneous Leftovers
This final bit includes a few brief tricks that weren't really appropriate to include elsewhere, but are still worth noting.
Remote Client Connection
If you have set the privileges to allow remote connections to a database, you can connect from a remote command line client by using the -h flag:
$ mysql -u -p -h For example to connect to a fictional vworks.keithjbrown.co.uk server:
$ mysql -u admin -p -h vworks.keithjbrown.co.ukNon-Interactive Commands
Sometimes you may wish to just do a quick look up on a table without the hassle of logging into the client, running the query then logging back out again. You can instead just type one line using the ' -e ' flag. For example:
$ mysql -u admin -p vworks -e 'SELECT cds.artist, cds.title FROM cds'
Enter password:
+------------+------------------------------+
| artist | title |
+------------+------------------------------+
| Jamiroquai | A Funk Odyssey |
| Various | Now 49 |
| westlife | westlife |
| Various | Eurovision Song contest 2001 |
| Abba | Abbas Greatest Hits |
+------------+------------------------------+
Securing MySQL Database
After installing mysql, we will need to remove test database and associated users and their permissions. We will need to use mysql database to remove associated users and permission.
--
DROP DATABASE test;
--
SELECT db.Host, db.Db, db.User, db.Select_priv
-> FROM db WHERE (db.DB = "vworksDB");
---
SELECT db.User, db.Host, db.Db
-> FROM db
-> WHERE (db.Db LIKE 'test%');
---
DELETE FROM db
-> WHERE (db.Db LIKE 'test%');
---
mysql> DELETE FROM db
-> WHERE (db.Host = "%");
mysql> DELETE FROM db
-> WHERE (db.User = "");
---
SELECT user.Host, user.User
-> FROM user
-> WHERE ((user.Host = "%") OR (user.User = ""));
---
DELETE FROM user
-> WHERE ((user.Host = "%") OR (user.User = ""));
---
FLUSH PRIVILEGES;
---
details: http://www.keithjbrown.co.uk/vworks/mysql/mysql_pA.php#secure
Joins in MySQL
Cross-Join: All row by all row, ex: SELECT
FROM ,
The Equi-Join or Inner Join:
Syntax:
SELECT
FROM ,
WHERE (Table1.column = Table2.column)
SELECT cds.artist, cds.title, genres.genre
-> FROM cds, genres
-> WHERE (cds.genreID = genres.genreID);
The Left Join:
Syntax:
SELECT
FROM
LEFT JOIN
ON Table1.column = Table2.column
Without adding a WHERE clause the Left Join produces the same results as the equi-join example above.
mysql> SELECT cds.artist, cds.title, genres.genre
-> FROM cds
-> LEFT JOIN genres
-> ON cds.genreID = genres.genreID;
The USING Clause: You can use this if the columns you are carrying out the join on have the same name.
Syntax:
SELECT
FROM
LEFT JOIN
USING ()
UPDATE JOIN:
mysql> UPDATE cds, artists
-> SET
-> cds.title = 'The Funkship Odyssey',
-> artists.Artist = 'George Clinton'
-> WHERE (artists.artistID = cds.artistID)
-> AND (cds.cdID = '2');
UPDATE cds->LEFT JOIN artists
-> ON cds.artistID = artists.artistID
-> SET
-> cds.title = 'A Funk Odyssey',
-> artists.name = 'Jamiroquai'
-> WHERE (cds.cdID = '2');
DELETE Joins:
mysql> DELETE cds
-> FROM cds, artists
-> WHERE (cds.artistID = artists.artistID)
-> AND (cds.artistID = '3');
MySQL Resources
Mysql Resources:
First you need the latest/stable version of MySQL. Right now 6.0 is the latest version. Mysql has two different releases like:
community server[free]
enterprise server[commercial]
Note: MAXDB is the current full package with many features that may not be fully tested.
Download MySQL:
You can download MySQL from
http://dev.mysql.com/downloads/mysql/5.0.html
Check the lower portion of the webpage.
Working with Mysql:
After installation you can interact with MySQL server through command line[Best to become guru]. But you may also want to use GUI IDE to interact with. GUI tools can be downloaded from:
http://dev.mysql.com/downloads/gui-tools/5.0.html
GUI are of two types: to administer MySQL server. Used for backup, restore, security
Development: For creating Database and interact with database using tables, Queries, Stored procedures, triggers
Application Development:
You can build appications that use MySQL databases at the backend. You can use languages like PHP, Perl, Java, .net to interact with MySQL databases. Usually, the package for these languages contain a driver to provide the functionalities. Otherwise, you can download drivers from:
http://dev.mysql.com/downloads/connector/
for many different programming languages like Java, PHP, Perl and similar.
MySQL Resources: MySQL Start
Mysql Resources:
First you need the latest/stable version of MySQL. Right now 6.0 is the latest version. Mysql has two different releases like:
community server[free]
enterprise server[commercial]
Note: MAXDB is the current full package with many features that may not be fully tested.
Download MySQL:
You can download MySQL from
http://dev.mysql.com/downloads/mysql/5.0.html
Check the lower portion of the webpage.
Working with Mysql:
After installation you can interact with MySQL server through command line[Best to become guru]. But you may also want to use GUI IDE to interact with. GUI tools can be downloaded from:
http://dev.mysql.com/downloads/gui-tools/5.0.html
GUI are of two types: to administer MySQL server. Used for backup, restore, security
Development: For creating Database and interact with database using tables, Queries, Stored procedures, triggers
Application Development:
You can build appications that use MySQL databases at the backend. You can use languages like PHP, Perl, Java, .net to interact with MySQL databases. Usually, the package for these languages contain a driver to provide the functionalities. Otherwise, you can download drivers from:
http://dev.mysql.com/downloads/connector/
for many different programming languages like Java, PHP, Perl and similar.
Stored Procedure in MySql
Starting from MySQL 5, you get Stored Procedure in Mysql
What is a stored procedure: A stored procedure is simply a procedure that is stored on the database server like MySQL. In programming languages, you write procedures to execute a function/logic. You can write similar procedure in SQL and store it in the database. From the front end application you can just call the procedure to get the functionality. Usually, you send series of sqls to the databases to execute a logic. A stored procedure is better as it needs one single call. But not every logic can/should be implemented as stored procedures.
Sample stored procedure
CREATE PROCEDURE sp_hello()
SELECT 'Hello World';
It just creates a procedure. To run the procedure, you have to type
call sp_hello;
You will see 'Hello World' as output.
Stored procedures can also take parameters and return values. Both needs to be mentioned as parameter. In the parameter list we can declare a variable as IN, OUT, or INOUT parameter. You can use SQLs[insert,select] inside procedure to perform database operations. Also, you can use to set/unset session variables.
Session variable example
SET @X=100;
CREATE PROCEDURE sp_in(p VARCHAR(10))
SET @x = P;
call sp_in(1000);
You see 1000 in screen
A more practical stored procedure with multiple statements
mysql> DELIMITER |
mysql> CREATE PROCEDURE sp_declare (P INT)
-> BEGIN
-> DECLARE x INT;
-> DECLARE y INT DEFAULT 10;
-> SET x = P*y;
-> INSERT INTO sp1(id,txt) VALUES(x,HEX('DEF'));
-> END|
Query OK, 0 rows affected (0.00 sec)
mysql> DELIMITER ;
mysql> CALL sp_declare(4);
please, note the use of | and DELIMETER. They provide support so that you can use ; in your procedures.
Some stored procedure related SQL commands
SHOW PROCEDURE STATUS
SHOW CREATE PROCEDURE Hello
SELECT * FROM INFORMATION_SCHEMA.ROUTINES [Ansi standard]
Thanks
Sayed
MySQL Reference Manual
MySQL: Special Interests
Database Administration : Random
Vacuum:
vacuum is a useful command to maintain PostGreSql database integrity. Also, it permanently removes deleted database records (by default PostGreSql does not delete permanently). It also rearranges page tables, segments for better performance. vacuum should be run regularly for optimized performance. You can also automate the execution.
Syntax: vacuum [FULL|Analyze|Verbose|table name]
Example:
vacuum member verbose
vacuum full
vacuum analyze full
vacuum also has linux command line command like
vacuumdb dbname
vacuum -a -f -v -z
--------
PostGreSql maintains a list of databases in the pg_database system table. Sometimes when it/the table gets corrupted, a full vacuum on all databases may fix the issue. When the table gets corrupted many operations like pg_dump may not work and you may get error messages like missing entry in the pg_database table.
----------
Backup command:
pg_dump options dbname > destination txt/tar file
Example: pg_dump -Ft -b mydb > db.tar
Restore Command:
Example: pg_restore -d newdb db.tar
--------------
Linux and PostGreSQL
commands to access PostGreSql:
su - postgres
password:
psql -l : list all databases
--------
How to optimize SQL Server Cursors
Optimize Cursors
- Avoid using SQL Server cursors whenever possible
- Always close SQL Server cursors when result sets are not needed
- Deallocate SQL Server cursors when the data structures comprising the cursors are not needed
- Reduce the number of records to process in the cursor
- Only use the required columns in the cursors
- Use READ ONLY cursors, whenever possible, instead of updatable cursors.
- Avoid using insensitive, static and keyset cursors, whenever possible.
- Use FAST_FORWARD cursors, whenever possible.
- Use FORWARD_ONLY cursors, if you need updatable cursors
What are cursors?
Cursors are server side database objects that are used by
applications to apply operations on the database table data
on a row-by-row basis. The operations may vary from one row
to another row dynamically based on the requirements
(business logic) and also multiple operations can be
performed on the same row. Typical SQL commands apply the
same updates/changes to all the selected rows at once where
cursors apply updates/changes one by one row.
Cursors must be declared in the database before they can be
used/called from the applications [front end]. Afterwards,
you can open cursors to fetch data using them, you can fetch
row by row and make multiple operations on the currently
active row in the cursor. You should close the cursors and
deallocate them after you are done with the cursors.
This is the Transact-SQL Extended Syntax:
DECLARE cursor_name CURSOR
[LOCAL | GLOBAL]
[FORWARD_ONLY | SCROLL]
[STATIC | KEYSET | DYNAMIC | FAST_FORWARD]
[READ_ONLY | SCROLL_LOCKS | OPTIMISTIC]
[TYPE_WARNING]
FOR select_statement
[FOR UPDATE [OF column_name [,...n]]]
An example cursor:
DECLARE AuthorsCursor CURSOR FOR
SELECT authors.id, au_lname, au_fname
FROM authors
ORDER BY authors.id
Another Example
DECLARE @AuthorID char(11)
DECLARE Cursor1 CURSOR READ_ONLY
FOR
SELECT au_id
FROM authors
OPEN Cursor1
FETCH NEXT FROM Cursor1
INTO @AuthorID
WHILE @@FETCH_STATUS = 0
BEGIN
PRINT @AuthorID
FETCH NEXT FROM Cursor1
INTO @AuthorID
END
CLOSE Cursor1
DEALLOCATE Cursor1
where
cursor_name: The name of the server side cursor
LOCAL: The cursor will be available only to the batch, stored procedure, or trigger in which the cursor was created
GLOBAL: The cursor is global to the connection
FORWARD_ONLY: The cursor can only fetch data sequentially
STATIC: The cursor will use a temporary copy of the data instead of base tables
KEYSET: Specifies keysets so that the membership and order of rows in the cursor are fixed when the cursor is opened
DYNAMIC: The cursor reflects all data changes made to the base tables as you scroll around the cursor
FAST_FORWARD: The cursor is FORWARD_ONLY and READ_ONLY
READ ONLY: The cursor [data] cannot be updated
SCROLL_LOCKS: The fetched data will be locked
OPTIMISTIC: The cursor does not lock rows as they are read into the cursor
select_statement: The standard select statement, cannot contain COMPUTE, COMPUTE BY, FOR BROWSE, and INTO keywords
UPDATE [OF column_name [,...n]]: Specifies which columns can be updated
SQL Server Hints
SQL Server Hints
----------------
What are hints?
Simply to influence query execution plans to retrieve data
faster.
syscacheobjects
----------------
SQL server after compiling a query creates the execution
plan and keeps it on syscacheobjects object.
You can query the object to see the existing query plans as
follows:
select * from sys.syscacheobjects
where cacheobjtype='compiled plan'
Hints
-----
When you specify hints in your queries the query execution
plans are biased toward your hints rather than using default
strategies.
Types of Hints
--------------
Join Hints: Affects the joins [join types]
Query Hints: Affects overall execution plan
Table: Affects to table access
Join Hints
----------
Loops: Uses nesting for join. Each row of the inner table is
checked for condition satisfaction with the corresponding
outer row.
Hash: One table is treated as a hash table and the other is
scanned one row at a time and the hash function is used to
find equalities
Merge: Each table is sorted first. Then they are compared
row by row [corresponding]
Remote: In this case, at least one table is remote. The join
is done at the right table side.
Example:
select *
-----
----
---
option (merge join)
or
select *
from x
inner merge join y on...
Query Hints:
------------
Specified at the end of the query with keyword 'Option'
select * from x
------------
------------
option (recompile, fast 80)
Hash Group or Order groups: Aggregations (group by, compute, distinct) are done by hashing or ordering
Concat Union, Hash Union, Merge Union: All unions will be done with the specified union type
Loop Join, Hash Join, Merge Join: Joins will be done with the specified join types
FAST number_rows:
Force Order:
MaxDop:
Optimize For:
Recompile:
Keep Plan:
Expand Views:
Table Hints:
------------
Affects how the table will be locked or which index will be
used for the query.
NoEXPAND
INDEX
fastfirstrow
nowait
rowlockpaglock
tablock
nolock
holdlock
ignore_triggers
Plan Guides
-----------
Sometimes queries are originated from a third part application and you have no control to change the query.
You can use plan guides to influence such queries. For example: Using plan guides you can specify the nature of the
select, update, insert queries.
You can use built in stored procedures to create or drop
plan guides.
sp_create_plan_guide
sp_control_plan_guide
Scalability: SQL Server 2005
Scalability: SQL Server 2005
After you have optimized your database design and the application design, you can improve the scalability of SQL Server in two ways
1.Scaling Up
2.Scaling Out
Scaling Up: Improve performance by
1. Adding more processors, memory, and storage capacity
2. Replacing existing hardware with faster hardware
Note: You also need the right version of SQL Server so that scaling up is supported.
How to scale up Processors:
SQL Server is mostly I/O bound not CPU bound, still better/faster CPU usually provides better performance [depends on the application need though].
Processor choices:
32 bit
64 bit x64
64 bit IA64
Hyperthread technology
Multiprocessor based server
Multicore based servers
It is always better to go for 64 bit than 32 bit as it will also support more memory. 64 bit x64 processors provide better performance than ia64 however if your work requires lots of floating point operations and the workload needs > 8 processors you can think about IA64 processors
Multiple processor systems provide better performance than the single processor systems. Windows implements SMP architecture of parallel computing and SQL Server 2005 also takes advantage of it.
Multiple processor systems usually performs better than the equivalent multicore uniprocessor systems.
Intel's hyperthreading technology may increase performance (10-20%) based on the applications (but not always)
Memory Subsystem:
As SQL server is I/O bound using more memory almost always increases performance. Though the amount of memory that we can add depends on the operating systems. 32 bit operating systems supports 4GB RAM max. For 32 bit windows 2 GB is reserved for operating system, in Windows Server 2003 we can specify 1 gb for operating system so getting 3gb for SQL Server.
32 bit SQL Servers can address 4 gb ram, the actual ram amount depends on the operating system used. if 32 bit windows then 2 to 3 gb max as mentioned before
When 64 bit windows and 64 bit sql server then physical memory limits 32TB.
I/O SubSystems:
You can improve disk I/O subsystem by using an appropriate RAID array or SAN (Storage Area Network) solution
Using Network Interface Card (NIC) teaming, you can also improve performance. NIC teaming allows to bond multiple physical NICs into one logical network device. Teaming provides load balance, fault tolerance, and high bandwidth.
Scaling Out will be discussed later.
Scaling Out involves using multiple SQL servers to provide increased scalability.
Scaling Out
Methods
- SQL Server instances
- Clustering
- Database mirroring
- Log shipping
- Replication
- Shared scalable databases
SQL Server instances
In the same computer install multiple instance of SQL Server to take advantage of multiple processors. Each instance may use separate processor with separate lock manager, worker threads, tempdb system database.
Clustering
Failover Clustering can provide robustness and protection against failure. Clustering can provide scaling using spare hardware through passive nodes
Database mirroring
With database mirroring transaction logs are directly sent to the mirror. The mirror is up to date and can be used in case of failure of the principle server.
Log shipping
It depends on the transaction log backups, file copying. The secondary database need not to be updated in real time. The secondary database is read only and hence, be in great use for reporting purposes.
Replication
Merge Replication: Both way, both from and to publishers and subscribers
Transactional Replication: Publisher to the subscriber
Shared scalable databases
Scale out a read-only database only for reporting purposes.
Dynamic Management Views
Dynamic Management Views provide ways to analyze database performance and to find the cause behind bad-performing SQL server databases/instances. Bad performance can be due to: poor design, insufficient memory, poorly configured system, disk bottlenecks, poorly written queries and many others
Previously Windows System Monitor(perfmon.exe) and SQL server profiler were used to find the cause of bad-performance. Now, Dynamic Management Views can help a lot better.
Dynamic Management Views are categorized into 12 categories
- Common language runtime:
- Database: Provides details about database sizes, files used, and partition information.
- Database mirroring related: About mirroring and mirroed databases
- Execution related: Provides insight into query execution statistics.
- Full text search: Provides information on full-text catalogs, indexes, full-text crawl memory buffer
- Input/Output Related: Provides insights into I/O operations, I/O devices, and database file statistics.
- Index related: Database file statistics
- Query notification related: Insights into active query notifications subscriptions in the server
- Replication related: Insights into the workings of replication in the database
- Service Broker Related: Insights into the workings of Service Broker
- Sql server operating system related: Insights into the internal operations of the SQL Server OS.
- Transaction Related: Insights into the operations of active transactions
Performance Data Warehouse: A historical archive of periodic snapshots of the DMV data of interest. How can it help? the problem may occur only at night 3:00 a.m. and when there are not many users. If you want to fix the issue in the morning Performance Data Warehouse will be of great help.
SQL Server Database System Tuning
Database tuning approach
- Proactive approach
- Reactive approach
Best Approach: Use a mix of both.
Be aware (collect information) of the followings that may have significant impact on the performance.
Hardware: Server, Available memory, number of processors, disk subsystems.
Network Infrastructure: Network Cards, Switches, and the rest of your LAN and WAN.
Operating System: Make sure the OpSys is optimally configured for SQL Server.
Database Engine: Know about the SQL Server Architecture and about your operational environments.
Database: Tune database properly
Client Connection: How clients connect to the server
Monitoring and Tuning Hardware: Use Network Monitor Agent, Performance Logs and Alerts, System Monitor, and Task Manager.
System monitor (perfmon.exe) is a great tool for the purpose Using perfmon.exe you can also find where the bottleneck is Processor Subsystem, Memory Subsystem, I/O Subsystem
Monitoring and Tuning SQL Server:
Tools:
- DBCC Commands
- Dynamic Management Views
- SQL Server Profiler
- SQL Server Management Studio
- System Stored Procedures
- SQL Trace
System Monitor: when SQL server is installed many new parameters (performance objects and their counters) are added to System Monitor (perfmon.exe) that help to monitor SQL Server performance.
SQL Server Profiler: This tool is used to capture traces like client activities, stored procedure calls, lock activities. SQL server profiler can help to determine if the client application is at fault.
SQL Trace: create traces using using stored procedures.
Tuning the tempdb system database
- Capacity plan and pre-allocate adequate space
- Separate the log file from the database file
- Use multiple secondary files for the tempDB system database
- Use a faster disk
- Use an appropriate RAID solution.
- Use local disk sybsystem for tempdb.
Tuning Database Layout
Use file and filegroup architecture properly.
File: Use multiple data files for your databases.
Filegroups:
example: create two file groups and place them in different drives. Create tables in one file group and create non-clustered indexes in the other. table I/O and Index I/O are now divided.
SQL Server: Index
SQL Server: Index
- Indexes make database access faster. Without index your queries will run but indexes can increase the performance dramatically.
- You can define multiple indexes for a table and select the index that is required for current operations.
- Only one clustered index is supported for a table
- To get advantage of indexes you have to mention the index related constraints in your where clause like where last_name ='xyz'
- If you have indexes on two or more columns, in where clause mention the first column name/constraint first [that was first in the index creation].
- A clustered index contains table data sorted in the index where a non-clustered index contains reference to the table data or clustered index. Non-clustered indexes are both physically and logically independent of table data.
Index Best Practices
- Select a column/(columns) for indexes that will provide good uniqueness and selectivity otherwise indexing may degrade the performance
- Create indexes that result in lower number of rows to be searched
- Create indexes that select a range of rows
- Try to create clustered index with as uniqueness as possible
- Keep indexes as narrow as possible (not many/unnecessary columns)
- Don't index very small table
In SQL server, you can create indexes through your sql statements or using the development studio interface.
SQL Server : Backups
Why are backups needed?
To protect accidental data loss. Still, if you use highly available and fault-tolerant systems like SAN, and RAID, you may need to backup your database regularly. These fault tolerant systems can not provide accidental data delete by users, accidental data corruption from software and hardware failure. If you have a disaster recovery site or a secondary datacenter to where all data are replicated and data changes are stored, then you may not need backup but otherwise regular backups are always useful.
Data backup can restore data until the last backup. . Transaction Logs may help to restore data up to the point of failure. Restore data from the data backup, then if you have the transaction log up to the failure apply it to the data.
SQL Server Recovery Models:
- Simple Recovery Model. Transaction logs are not used
- Full Recovery Model: Use both data and transaction log backups
- Bulk-logged recovery model: Like full recovery model. But here bulk operations are minimally logged to the transaction log. So full recovery becomes faster.
Types of Backups
- Full database backup: Full database, all files and file groups are backed up
- Partial Back up: It's not the differential backup. Here, all primary file groups and any other file groups that are read-write are backed up. But read-only file groups are not backed up by default.
- File or file group backup: Usefull when full back ups take too long. Backup in parts.
- Differential back up: Backup only the changes. Should be applied to a full backup.
- Log backup: In log back ups, transaction logs are backed up regularly. It helps in bulk logged recovery model and full recovery model.
- Copy only backups:
- Full text catalog backups
Knowing about backup devices, media sets, and backup history tables is also important.
Mirrored Media Sets: Backup data to more than one media at the same time.
Backup Strategy:
Based on your company need and type of data, you should come up with a backup strategy.
Some backup strategies
Strategy - 1
- Take full database backup every saturday night
- Take a differential backup every wednesday night
- Perform, continual transaction log backups every 30 minutes
Strategy - 2: If the database is too big, and user activities need continuous uptime and optimal performance
- Filegroup 1 backup every saturday night
- Filegroup 2 backup every sunday night
- File group 1 differential backup every Tuesday night
- File group 2 differential backup every wednesday night
- File group 1 differential backup every Thursday night
- File group 2 differential backup every Friday night
- Perform, continual transaction log backups every 30 minutes
Backup System Databases
It is also important to back up system databases like master, model, msdb, tempdb, distribution, resource.
Oracle background processes
Oracle Process architecture
Oracle processes running simultaneously in the background
- archiver processes (ARCn)
- checkpoint process (CKPT)
- database writer processes (DBWn)
- dispatcher processes (Dnnn): multiplex server-processes on behalf of users
- memory-manager process (MMAN): used for internal database tasks such as Automatic Shared Memory Management
- job-queue processes (CJQn)
- log-writer process (LGWR)
- logical standby coordinator process (LSP0): controls Data Guard log-application
- media-recovery process (MRP): detached recovery-server process
- memory-monitor process (MMON)
- memory-monitor light process (MMNL): gathers and stores Automatic Workload Repository (AWR) data
- process-monitor process (PMON)
- process-spawner (PSP0): spawns Oracle processes
- queue-monitor processes (QMNn)
- recoverer process (RECO)
- remote file-server process (RFS)
- shared server processes (Snnn): serve client-requests
- system monitor process (SMON)
Oracle Database Objects
Oracle Database Objects
Schema Objects
- Clusters
- Constraints
- Database links
- Database triggers
- Dimensions
- External procedure libraries
- Index-organized tables
- Indexes
- Indextypes
- Java classes, Java resources, Java sources
- Materialized views
- Materialized view logs
- Object tables
- Object types
- Object views
- Operators
- Packages
- Sequences
- Stored functions, stored procedures
- Synonyms
- Tables
- Views
Nonschema Objects
- Contexts
- Directories
- Parameter files (PFILEs) and server parameter files (SPFILEs)
- Profiles
- Roles
- Rollback segments
- Tablespaces
- Users
What are schema objects?
A collection of logical structures of data, or schema objects. In Oracle, each user owns a single schema and the schema is named after the user.
Non-schema Objects:
Other types of objects stored in the database but not part of any schema.
Both can be manipulated using SQL
Oracle Certifications Overview
Oracle: Application Developer : Certification : Topics
Oracle 11g: Basic PL/SQL
- Database objects:
- table
- view
- sequence
- Index
- synonym
- stored procedure
Will be continued
IBM DB 2: free version of IBM DB2
DB2 Express-C is the free edition of DB2 for Linux and Windows. It is free to download, develop applications, deploy into production, and even redistribute. DB2 Express-C supports development of database applications using XML, C/C++, .NET, JDBC, ODBC, PHP, Ruby, and more. It is optimized for systems with up to 2 processors and 4 GB of memory, and does not impose artificial restrictions on database size, number of databases, or number of users.
Random DB2 Stuffs
IBM-DB2: Random stuffs.
This is a great book on IBM-DB2. If you are experienced with DBMS systems like SQL Server, Oracle, MySQL, the first thing you may want to look at the features that are supported by IBM-DB2. Then you can try to find out the differences among them. It really will make your learning much faster. Then just skim through page by page, when you are at the end of the book, you will find you are ok to work with IBM-DB2.
http://publib-b.boulder.ibm.com/Redbooks.nsf/RedbookAbstracts/sg244249.html
PHP has concepts like PDO and PDO_ODBC, and generic odbc features that will allow you to connect to and work with DB-2 databases. Generic odbc is the general/procedural way to connect to. PDO and PDO_ODBC are object oriented ways. PDO works with native IBM drivers. PDO_ODBC uses odbc connections to the DB2. For PHP you need to install drivers to support DB2. I worked with PDO_ODBC to connect to DB-2 databases in IBM iSeries servers. You need to install odbc db2 drivers for iseries servers. Then in /etc/odbcinst.ini file you need to mention driver specifications and in /etc/odbc.ini, you need to mention the odbc dsn name, database name, and some other parameters to connect to the database. Though, you may also supply some parameters in the PDO_ODBC connect method.
For details on PHP and DB2, please check:
http://www.redbooks.ibm.com/redbooks/pdfs/sg247218.pdf
Crystal Report for Java/JSP
Similar to the way you use crystal report for report creation in Visual Studio. The following web-pages and videos demonstrate how to use crystal reports in eclipse IDE [for Java] for report creation. Creating reports using tools like this may be faster and more efficient than creating reports with Java/JSP coding solely. Modification to existing reports also becomes much more easy. Just like to change a column title or to change the font size, you will not require to dig into codes
Crystal report for Java can be used with other environments like
IBM Rational Application Developer
IBM WebSphere Studio
BEA WebLogic Workshop
Borland JBuilder
For details on how to, please check:
Crystal Report for Java: Environments
FireBird GUI Tools
GUI Tools for Firebird DB management
------------------------------------
Commercial
These are very good tools
IBExpert - Developer Studio
Interbase firebird development studio - develop/design
EMS sql mgmt studion for interbase/firebirdails.html -- http://www.soft32download.com/software/EMS-SQL-Manager-2005-for-InterBase-Firebird-download-details.html
xstarter FBIB - good management tools--scheduling of works -- http://www.xstarter.com/firebird-interbase-job-scheduler.html
Free
----
Development and Administration Tools
DbVisualizer Free
JDBStudio
ibview
ibexpert free personal edition limited
IBSQL not great but have some good features
Crypton firebird database manager -- very good
administration tool
IBUtils--design tools
DBManager Professional--good tool design/develop/admin tool
FIBS Firebird -- InterBase Backup Scheduler
IBEasy+ -- not much useful
Recommendation:
------------------
Crypton firebd db manager ---- very good administration
tool -- http://www.crypton.co.uk/freetools.html
DBManager Professional ---- good tool design/develop/admin tool
http://www.dbtools.com.br/EN/downloads/downloads_file.php?file_id=11
http://www.download.com/DBManager-Standard-Edition/3000-2065-10580436.html?part=dl-DBManager&subj=dl&tag=button
FIBS Firebird-InterBase Backup Scheduler -- http://www.talatdogan.com/frees.htm
Convert database from one format to another
EMS SQL Manager for InterBase/Firebird
It is very good product for developer. Seems also has export/import capabilities.
http://www.sqlmanager.net/products/ibfb/manager
Migrating Interbase to Firebird: Front end Java
http://www.topicscape.com/InterBase_Firebird_migration/IB2FB.php
Convert db from one to another: A great tool
http://www.filedudes.com/Database_Viewer_Editor-download-39992.html
http://www.yankeedownload.com/free-download/9579/database-viewer-editor.html
Database Converters
http://www.newfreedownloads.com/find/interbase.html
Convert Master
http://www.torry.net/authorsmore.php?id=3274
Firebird Stored procedure delphi
In firebird, the syntax to create TRIGGERs and Stored Procedures is as below:
---
SET TERM !! ;
CREATE TRIGGER auto_update_1 FOR DBPATHS
after INSERT POSITION 0
AS
BEGIN
insert into commission values (100,100);
END !!
SET TERM ; !!
---
SET TERM !! ;
CREATE PROCEDURE CountTo10
RETURNS (Cnt INTEGER)
AS
BEGIN
Cnt = 1;
WHILE (Cnt <= 10) DO
Cnt = Cnt + 1;
END !!
SET TERM ; !!
----
An example how to use stored procedure from delphi is as follows:
procedure TForm1.Button1Click(Sender: TObject);
var
result : integer;
begin
if ibdatabase1.Connected = false then
begin
ibdatabase1.Connected := true;
end;
with IBStoredProc1 do
execproc;
IBStoredProc1.Transaction.Commit;
result := IBStoredProc1.Params.ParamValues['Cnt'];
showmessage(inttostr(result));
end;
SAP: Resources
SAP: Career Prospects
SAP JCo Connector
Developing J2ee Applications for SAP
SAP Overview
- Introduction to ERP and SAP
- Enterprise Resource Planning (ERP) - helps to
manage and operate business processes
effectively and in an integrated/co-ordinated
fashion. The integration nature helps businesses
to understand their business/business processes more effectively
and make decision intelligently
- ERP makes use of IT to integrate all business
processes into a single co-ordinated system
- In ERP only one integrated database is used that is shared
among all the business departments. Operational
data from all business departments such as
production, sales, finance, hr, customer
service, SCM come into the same database. Hence, data are more consistent and complete. The database may not be in one single physical location, but can be distributed and integrated database
- As you get all the information in the same place, generating cross-department reports, applying BI to these data become easier and more meaningful. Getting overall picture of the company becomes much easier (than multiple DBMS). It helps the management take important decisions and plan the company future more effectively
- If all the departments were using separate databases for their operations, it would be harder to generate cross-functional reports, and get overall picture of the company. Merging different databases also would become a big headache.
- Additionally, Many jobs require inter-department communications. ERP makes such communication easier/faster, more manageable
- If all the departments were using separate databases/systems, interdependent activities would require to insert data (relevant to the dept) in all the databases separately - more work, require more time, more error prone. If a change is needed, the change needs to be done in multiple places - again the three mores. The departments would require to submit individual reports to the management - more difficult for the management to understand the areas of the interdependent activities
- To address such issues ERP came into play
- Evolution: MRP, MRP II, ERP
- MRP: Materials Requirements Planning: 1960: To control inventory management. Overstock or lower-stock inventory control
- MRP II: Manufacturing Resource Planning: 1970s: for manufacturing companies to automate production, sales, marketing, HR, Finance: focused manufacturing businesses only.
- ERP: 1990s: For standard business organizations addressing any type of business.
- SAP
- SAP: Systems, Applications Products: Was established in 1972
- Versions of SAP: R/2 (late 1970s), R/3 (early 1990), mySAP ERP (late 1990), standard business products, industry specific products, for small and mid-size enterprise solutions
- R/2 for mainframe platform with character based clients, R/3 improved R/2 to support diversified and changing technologies, mySAP ERP further improved R/3 to support ebusiness
- mySAP ERP: has many modules such as Finance, HR for specific departments, also integrates related modules
- mySAP ERP features: process auditing, centralized system management, centralized operation management, integrated and modular model, integration with non-SAP systems with netweaver platform, e-business support
- mySAP ERP solutions: CRM (Customer Relation Management), ERP (Enterprise Resource Planning), SCM (Supply Chain Mgmt), SRM (Supplier Relationship Management), PLM (Product Lifecycle Management)
- SAP solutions for small and mid size enterprises (SMEs): SAP Business One, mySAP All-in-One
- Industry specific solutions: Aerospace and Defense, Oil and Gas, Banking, Media
- mySAP:
- R/3 introduced GUI for SAP. Latest version for SAP R/3 is the Enterprise Central Component (ECC) = core component of mySAP ERP.
- mySAP: Three tier architecture: presentation, application, data. Presentation: GUI, application: processes, Data: storage and manipulation of data
- SAP tasks: functional, technical
- Functional: end user carry out an operation
- Technical: Administrators and programmers - update SAP system
- each task is considered as a transaction, each transaction is associated with a transaction code
- ME21 - functional - purchase order, SE38 - technical - work with ABAP (Programming Language for SAP) editor
- SAP Data: Organizational Structure, Master Data : needs to be created before a SAP systems come into operation.
- Organizational Structure - structure of the organization - people - departments. Master: example: a list of suppliers - change more frequently than organizational structure
- Reporting in SAP R/3: creating report programs (create with ABAP), generating report lists - programmed report programs
How to call SAP Function Module/RFC from Java
SAP Web Shop Resources