In SQL Server architecture, the term "database server" refers to the Windows operating system process that represents the active, programmed side of the database system. All management of the database system is performed through and by the server. Under Windows, a program can be started as a service. A service runs in the background and has no graphical user interface. Microsoft SQL Server provides the following services:
- MSSQLServer: the database server itself.
- SQL Server Active Directory Helper: enables integration with Active Directory.
- SQLServerAgent: automatic task execution and SQL Server alert handling.
- SQL Server Browser: provides SQL Server connection information to client computers.
- SQL Server FullText Search (MSSQLSERVER): quickly creates full-text indexes on the content and properties of structured and semi-structured data to allow fast linguistic searches on this data.
- SQL Server VSS Writer: provides the interface for backing up and restoring Microsoft SQL Server through the Windows VSS infrastructure.
- Distributed Transaction Coordinator: distributes transactions over multiple SQL Servers.
- MSSQLServerAdHelper: adds and removes the objects used to register instances of SQL Server, and ensures that the Windows account under which a SQL Server service runs has permission to update all of the Active Directory objects for the instance, as well as any replication publications and databases for that instance.
The SQLServerAgent service must be started automatically together with MSSQLServer. SQL Server can be started and stopped manually from the command prompt, for example with net start mssqlserver or net start SQLServerAgent, or by running SQLSERVR.EXE directly. Specifying the -T option starts SQL Server with a particular trace flag in effect.

SQL Server supports multiple instances of the SQL Server database engine running concurrently on the same computer. Each instance of the database engine has its own set of system and user databases that are not shared between instances. There are two types of instances of SQL Server:

Default instance: The default instance is identified by the name of the computer on which it is running. When a client program specifies only the computer name in its request to connect to SQL Server, a connection to the default instance of the database engine on that computer is established. There can be only one default instance on any computer, and the default instance can be of any version of SQL Server.
Named instances: All instances of the database engine other than the default instance are identified by an instance name specified during installation of the instance. Clients must provide both the computer name and the instance name of any named instance to which they want to connect, for example computer_name\instance_name.
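As an illustration, the naming rule for default and named instances can be sketched as a small Python function; the function name and the sample host and instance names are hypothetical, chosen only to mirror the text above:

```python
def server_identifier(computer_name, instance_name=None):
    """Return the name a client uses to reach a SQL Server instance.

    The default instance is addressed by the computer name alone;
    a named instance needs computer_name\\instance_name.
    """
    if instance_name is None:
        return computer_name                      # default instance
    return f"{computer_name}\\{instance_name}"    # named instance

print(server_identifier("SAPHOST"))            # default instance
print(server_identifier("SAPHOST", "SAPSID"))  # named instance
```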
A database server is an application that runs as a system process on a computer. Client tools communicate with the database server through network protocols such as TCP/IP or NWLink IPX/SPX Compatible Transport, the native protocol of Novell NetWare networks. The network protocol used by the database server determines the method used for interprocess communication, for example named pipes or TCP/IP sockets. These methods are installed as net libraries (DLLs). Only one net library is active on a client. Several application programming interfaces (APIs) are available, for example OLE DB and DB-Library; other client applications can use the ODBC interface.
A process is a program that is currently being executed. Under Windows, each process has its own address space, which keeps processes separate from each other and prevents interference. The process is also the owner of all program resources, such as file handles and access tokens. Windows 2000 allows a process to manage a 4 GB range of linear addresses. An application process controls the lower portion of this space (either 2 GB or 3 GB); the upper portion is managed by system code for that particular process.

The basic unit of activity in Windows is the thread. Each process has at least one thread and may have many. Windows assigns time slices on physical processors to threads, and all threads can run concurrently. Since no address space switch is involved, switching between threads within the same process is much faster than a conventional process switch in operating systems that run without the thread concept.

SQL Server has an internal layer that implements an environment similar to an operating system for scheduling and synchronizing concurrent tasks without having to call the Windows kernel. This internal layer can schedule fibers as effectively as it works with threads. A fiber is a unit of execution that runs in the context of the thread that schedules it; each thread can schedule multiple fibers. The server configuration parameter lightweight pooling controls whether SQL Server uses threads or fibers. The default is 0, in which case SQL Server schedules a thread per concurrent user command. If lightweight pooling is set to 1, SQL Server uses fibers instead of threads.
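The relationship between one scheduling thread and several fibers can be mimicked with Python generators; this is only an analogy to the mechanism described above, not SQL Server code, and all names are invented:

```python
def fiber(name, steps):
    # A "fiber": a cooperatively scheduled unit of execution that
    # yields control back to its scheduling thread after each step.
    for i in range(steps):
        yield f"{name}:{i}"

def scheduler(fibers):
    # One thread round-robins over its fibers; no kernel context
    # switch is needed to move from one fiber to the next.
    log, queue = [], list(fibers)
    while queue:
        fib = queue.pop(0)
        try:
            log.append(next(fib))
            queue.append(fib)        # re-queue until exhausted
        except StopIteration:
            pass
    return log

print(scheduler([fiber("A", 2), fiber("B", 2)]))
# → ['A:0', 'B:0', 'A:1', 'B:1']
```

The point of the analogy: the "switch" between fibers is just an ordinary function return inside one thread, which is why it is cheap.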
SQL Server runs in an operating system process called SQLSERVR. This process contains threads for the operating system and threads for clients logged on to the server. Every connection between a client and SQL Server uses one of these threads. SQL Server maintains a pool of either threads or fibers for user connections; the maximum size of this pool is controlled by the max worker threads server configuration parameter. One thread can serve several connections.

Special threads:

The Lazy Writer thread scans the data cache to write dirty pages to disk. Dirty pages are data pages that have been entered into the buffer cache and modified, but not yet written to disk. The Lazy Writer thread sleeps for an interval of time. When it is restarted, it checks the size of the free buffer list. If the free buffer list is below a certain point, the Lazy Writer thread scans the buffer cache to write dirty pages to disk. Most of the work of writing dirty pages is done by the user threads, so the Lazy Writer thread typically finds little to do.

The Lock Manager dynamically adjusts the resources it uses for larger databases, eliminating the need to adjust the locks server configuration parameter manually.

The Log Writer writes log records asynchronously, except when:
- a commit forces all pending records for a transaction to disk, or
- a checkpoint forces all pending records for all transactions to disk.
The Checkpoint thread scans the buffer cache for dirty pages and writes to disk any buffer pages that are marked as dirty. Checkpoints typically find few dirty pages to write to disk, because most dirty pages are written to disk by the worker threads or the Lazy Writer thread in the period between two checkpoints.

The Background task checks every 30 minutes whether the database or the transaction log files can be shrunk, in case the database option autoshrink is set. It also starts every 5 seconds and checks 20 data pages for ghost records: records that have been deleted logically but not yet physically.

SQL Server must verify that the login id supplied on each connection request is authorized to access the database server. This process is called authentication. SQL Server supports three security modes for authentication:

SQL Server security: A connection to SQL Server is established through a SQL Server login and password, for example using login id sa.

Trusted security: A connection to SQL Server is established using the Windows user account. When a user connects to SQL Server through a Windows user account, SQL Server verifies that the account name and password were validated when the user logged on to the operating system. The SQL Server client then requests a trusted connection to SQL Server. The properties of a trusted connection include the Windows group and user accounts of the client that opened the connection. A member of the SQL Server sysadmin fixed server role, for example login id sa, must first specify to SQL Server all the Windows accounts or groups that may connect to SQL Server.

Mixed security: Mixed security supports both SQL Server security and trusted security.

The system databases are:

master holds all the system-level information for a SQL Server system. It stores all login accounts and all system configuration parameters.

model is used as the template for all databases created on SQL Server.
msdb is used by SQL Server Agent for scheduling alerts and jobs.

tempdb holds all temporary tables and temporary stored procedures for all users connected to SQL Server. It also fills any other temporary storage needs, such as work tables generated by SQL Server. Every time SQL Server is started, tempdb is re-initialized. The initialization is recorded as "clearing tempdb" in the error log.

Northwind and pubs are sample databases provided as learning tools.

SQL Server maps a database over at least two operating system files. Data and log information are never contained in the same file, and individual files are used by only one database. The primary file (.mdf) contains the startup information and system tables for the database and is used to store data. Every database has one primary file. Secondary files (.ndf) hold data that does not fit in the primary data file. Databases do not require any secondary data files if the primary file is large enough to hold all the data in the database. Transaction log files (.ldf) contain the log information used to recover the database. There must be at least one log file for each database.
Data files are assigned to one filegroup. A single table can use space in several different files within the same filegroup. The PRIMARY filegroup is created when SQL Server is installed. Database files only grow automatically in a filegroup if the option autogrow is set and no further space is available on any of the database files in the filegroup. Filegroups use a proportional fill strategy across all the files within each filegroup. As data is written to the filegroup, SQL Server writes an amount proportional to the free space in the file to each file within the filegroup, rather than writing all the data to the first file until full and then writing to the next file. For example, if File1 has 100 MB free and File2 has 200 MB free, one extent is allocated from File1, two extents from File2, and so on. This way both files become full at about the same time and simple striping is achieved. Transaction log files, however, cannot be part of a filegroup; they are separate from one another. As the transaction log grows, the first log file fills, then the second, and so on, using a fill-and-go strategy rather than a proportional fill strategy. Therefore, when a log file is added, it may not be used by the transaction log until the other files have been filled first.
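The proportional fill arithmetic described above can be sketched in Python. This is a hedged illustration of a single allocation round; the function name and the simplification to whole extents are my own:

```python
def proportional_fill(free_mb):
    """Allocate one round of extents proportionally to free space.

    Mirrors the example in the text: with 100 MB free in File1 and
    200 MB free in File2, one extent comes from File1 for every two
    extents from File2, so both files fill up at about the same time.
    """
    smallest = min(free_mb.values())
    return {f: round(free / smallest) for f, free in free_mb.items()}

print(proportional_fill({"File1": 100, "File2": 200}))
# → {'File1': 1, 'File2': 2}
```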
The page is the unit of data storage in SQL Server. Pages are 8 KB in size. SQL Server allocates pages to objects and reuses space freed up by deleted rows. These operations are internal to the system and use data structures not visible to users. Extents are the basic unit in which space is allocated to tables and indexes. An extent consists of 8 contiguous pages, or 64 KB. A new table or index allocates its first pages from mixed extents, which means an extent can contain pages from different objects. An extent is called uniform when a table or index allocates all eight of its pages. Log files do not contain pages; they contain a series of log records, allocated in virtual log files.
File Header is a special page that contains information about the file.

Page Free Space (PFS) pages record whether an individual page has been allocated, and the amount of space free on each page. Each PFS page covers 8000 pages. For each page, the PFS has a bitmap recording whether the page is empty, 1-50% full, 51-80% full, 81-95% full, or 96-100% full. Once an extent has been allocated to an object, SQL Server uses the PFS pages to record which pages in the extent are allocated or free, and how much free space is available for use. This information is used when allocating a new page or finding a page with free space for a newly inserted row.

Global Allocation Map (GAM) pages record which extents have been allocated. Each GAM covers 64000 extents, or nearly 4 GB of data. The GAM has one bit for each extent it covers. If the bit is 1, the extent is free; if the bit is 0, the extent is allocated.

Shared Global Allocation Map (SGAM) pages record which extents are currently used as mixed extents and have at least one unused page. Each SGAM covers 64000 extents, or nearly 4 GB of data. The SGAM has one bit for each extent it covers. If the bit is 1, the extent is being used as a mixed extent and has free pages; if the bit is 0, the extent is not being used as a mixed extent, or it is a mixed extent whose pages are all in use.

Index Allocation Map (IAM) pages map the extents in a database file used by a heap or index. Each heap or index has one or more IAM pages recording all the extents allocated to the object. A heap or index has at least one IAM for each file on which it has extents, and may have more than one IAM for a file if the range of the extents for the heap or index on that file exceeds the range that a single IAM can record.

Bulk Changed Map (BCM) pages record the extents that have been changed by bulk operations such as SELECT INTO and CREATE INDEX since the last transaction log backup. If the bit is 1, the extent has been changed by a bulk operation; if the bit is 0, it has not. BCM pages are relevant only when the recovery model of the database is set to bulk_logged. See also chapter 5, Database Recovery.
Differential Changed Map (DCM) pages record which extents have changed since the last execution of BACKUP DATABASE. Each DCM covers 64000 extents, or nearly 4 GB of data. If the bit for an extent is 1, the extent has been changed since the last BACKUP DATABASE; if the bit is 0, it has not. Read more about the usage of the DCM in Unit 4: Database Backup.

The Boot page is the ninth page in the database file, that is, the first page of the second extent. It is stored in the primary database file and in the first transaction log file. The boot page contains attributes of the database; for example, it records attributes needed for an automatic recovery.

The information returned by the transact-SQL command DBCC SQLPERF(LOGSPACE) can be used to monitor the amount of log space used and indicates when to back up the transaction log. System stored procedure sp_spaceused computes the amount of disk space used for data and indexes. When updateusage is specified, SQL Server scans the data pages in the database and makes any necessary corrections to the sysindexes table regarding the storage space used by each table. There are some situations, for example after an index is dropped, when the sysindexes information for the table may not be current. This process can take some time to run on large tables or databases. The command DBCC UPDATEUSAGE can be run separately.
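The size figures quoted above (8 KB pages, 64 KB extents, nearly 4 GB of data per GAM/SGAM/DCM page, and the PFS fullness classes) can be checked with a small Python sketch; the constant and helper names are invented for illustration:

```python
PAGE_KB = 8
EXTENT_PAGES = 8
EXTENT_KB = PAGE_KB * EXTENT_PAGES   # 64 KB per extent
MAP_EXTENTS = 64000                  # extents covered per GAM/SGAM/DCM page

# 64000 extents * 64 KB = 4,096,000 KB, i.e. just under 4 GB
map_coverage_gb = MAP_EXTENTS * EXTENT_KB / (1024 * 1024)

def pfs_bucket(percent_full):
    # The fullness classes recorded per page in the PFS bitmap.
    for limit, label in [(0, "empty"), (50, "1-50%"), (80, "51-80%"),
                         (95, "81-95%"), (100, "96-100%")]:
        if percent_full <= limit:
            return label

print(EXTENT_KB, round(map_coverage_gb, 2))  # → 64 3.91
print(pfs_bucket(60))                        # → 51-80%
```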
An SQL Server database consists of tables that contain data, and of other objects such as views, indexes, and stored procedures defined to support the activities performed on the data. These database objects are stored on pages, along with other data objects defined in the system tables. Indexes are used to speed up searching for records in tables. Indexes can be created for a frequently used search field or a combination of search fields. SQL Server has two types of indexes:
- Non-clustered index
- Clustered index
A non-clustered index is a B-tree, which is searched from the highest level down to the data pages. A search through the index requires the following page accesses: one page for each index level, plus one access for each qualifying data page. The graphic displays data and index pages of a table. The search fields are City and Country, and a non-clustered index exists on the field City. With a non-clustered index, index and data pages are separate, so you can create as many non-clustered indexes for a table as needed.
The leaf level of a non-clustered index contains the location of each data row, identified by a record identifier (RID) comprising the file number, page number, and slot number of the row.
A clustered index dictates the physical storage order of the data in the table, which means a table can contain only one clustered index. All inserts are made so that they fit into the ordering sequence of the clustered index key. A clustered index is organized as a B-tree. Each page in a clustered index holds a page header followed by index rows. Each clustered index row contains a key value and a pointer to either a page or a data row. Each page in a clustered index is called an index node. The top node of the B-tree is called the root node; the bottom nodes are called the leaf nodes. In a clustered index, the data pages make up the leaf nodes. Index levels between the root and the leaves are known as intermediate levels. SQL Server navigates down the clustered index from the root node to find the row corresponding to a clustered index key. To find a range of keys, SQL Server navigates through the clustered index to find the starting key value in the range, and then scans through the data pages. To find the first page in the chain of data pages, SQL Server follows the leftmost pointers from the root node of the clustered index.
If a table has a clustered index and a clustering key, the leaf nodes of all non-clustered indexes use the clustering key as the row locator rather than the physical record identifier (RID). If a table does not have a clustered index, non-clustered indexes continue to use the RID to point to the data pages. In both cases, the row locator is stable. When a leaf node of a clustered index is split (data page), the non-clustered indexes do not need to be updated because the row locators are still valid. If a table does not have a clustered index, page splits do not occur.
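The row locator rule described above can be sketched as a small Python function; the function name and sample values are invented for illustration:

```python
def row_locator(has_clustered_index, clustering_key, rid):
    """Which locator a non-clustered index leaf row stores, per the
    text: the clustering key when the table has a clustered index,
    otherwise the RID (file number, page number, slot number)."""
    return clustering_key if has_clustered_index else rid

rid = (1, 4711, 3)  # (file, page, slot) -- sample values, made up
print(row_locator(True, "CUST-0042", rid))   # clustering key used
print(row_locator(False, "CUST-0042", rid))  # RID used (heap)
```

Because the locator is stable in both cases, a page split in the clustered index does not force updates to the non-clustered indexes.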
A database transaction is a series of SQL commands that are either completed in full or not executed at all. A transaction is started with the command BEGIN TRAN. All commands within one transaction are handled as one atomic unit by SQL Server.

Each SQL Server database has a transaction log that records data modifications made in the database. The log records the start and end of every transaction and associates each modification with a transaction. SQL Server stores in the log the information needed to either redo (roll forward) or undo (roll back) the data modifications that make up a transaction. Each record in the log is identified by a unique log sequence number (LSN).

A transaction is ended with the command COMMIT TRAN or ROLLBACK TRAN. SQL Server then writes all logging information from the transaction to the log file and sends a message to the application program confirming the COMMIT TRAN. This ensures that all confirmed changes are logged on a physical disk. Records to be changed are first stored in a cache; sometimes the records to be changed are up to date only in the cache. During a checkpoint, all dirty data and log pages in the cache are written to the files. There are two kinds of checkpoints:
- Manual checkpoint (transact-SQL command CHECKPOINT)
- Automatic checkpoint

The SQL Server configuration parameter recovery interval controls when SQL Server issues a checkpoint in each database. Checkpoints are done on a per-database basis. The recovery interval sets the maximum number of minutes per database that SQL Server needs to recover databases. The default is 0, indicating automatic configuration by SQL Server; this means a recovery time of less than one minute and a checkpoint approximately every minute for each database.

When a database is set to use the simple recovery model, logging information is erased from the log at each automatic checkpoint. This option is useful for databases that undergo many changes but have minimal security requirements, for example the databases msdb and tempdb. If this mode is set, a restore cannot apply transaction logs written later than the checkpoint. Perform a full backup immediately after disabling this option.

During a checkpoint, the log sequence number (LSN) of the first log record at which a system-wide recovery must start is written to the database boot page. This LSN is called the Minimum Recovery LSN (MinLSN) and is the lowest of the following three values:
- the LSN of the checkpoint
- the LSN of the oldest recorded dirty data page
- the LSN of the start of the oldest active transaction

The portion of the log file from the MinLSN to the end of the log is called the active portion of the log. This is the portion of the log required for a full recovery of the database. No part of the active log can ever be truncated; all log truncation must be done on the parts of the log before the MinLSN.
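The MinLSN rule lends itself to a one-line sketch; the LSN values are invented:

```python
def min_recovery_lsn(checkpoint_lsn, oldest_dirty_page_lsn,
                     oldest_active_txn_lsn):
    # MinLSN is simply the lowest of the three LSNs named in the text;
    # everything from this point to the end of the log is the active log.
    return min(checkpoint_lsn, oldest_dirty_page_lsn,
               oldest_active_txn_lsn)

# A long-running open transaction (LSN 1100) pins the active log,
# even though the checkpoint itself is at LSN 1200.
print(min_recovery_lsn(1200, 1150, 1100))  # → 1100
```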
An RDBMS processes a large number of transactions simultaneously. A lock synchronizes simultaneous accesses to an object. When a transaction accesses an object, the object is temporarily locked to prevent other transactions from accessing it at the same time. The type of lock determines which operations from other transactions can be executed on the locked object. The types of locks are:

Shared (S): Used for operations that do not change or update data (read-only operations), such as a SELECT statement.

Exclusive (X): Used for data-modification operations, such as UPDATE, INSERT, or DELETE. This type of lock ensures that multiple updates cannot be made to the same resource at the same time.

Update (U): Used on resources that can be updated. This type of lock prevents a common form of deadlock that occurs when multiple sessions are reading, locking, and potentially updating resources later.

Intent (I): Used to establish a lock hierarchy.

Schema (Sch): Used when an operation dependent on the schema of a table is being executed. The two types of schema lock are schema stability (Sch-S) and schema modification (Sch-M).
SQL Server has multi-granular locking that allows different types of resources to be locked by a transaction. Resources are locked at a level appropriate to the task, using a dynamic locking strategy to determine the most cost-effective locks. SQL Server can lock the following resources, listed from finest to coarsest granularity:

RID: Row identifier, used to lock a single row within a table
KEY: Row lock within an index, used to protect key ranges in serializable transactions
PAG: 8 KB data or index page
EXT: Extent, a contiguous group of eight data or index pages
TAB: Entire table, including all data and indexes
DB: Database

The finer the lock granularity, the more locks are needed. For example, when accessing a table with 100,000 pages, you can use 100,000 page locks or only one table lock. More locks require more administration time; if the lock granularity is coarse, other transactions must wait longer until the lock is released. To display the locks currently in use, use the stored procedure sp_lock or the Current Activity view in the Enterprise Manager.

SQL Server may dynamically escalate or de-escalate the granularity or type of locks. Lock escalation is the process of converting many fine-grain locks into fewer coarse-grain locks, thus reducing system overhead. In this example, a transaction requests rows from a table for update purposes. SQL Server automatically acquires locks on the affected rows (1) and places higher-level intent locks on the pages (2) or index that contain those rows. The table that contains the rows also receives an intent exclusive lock (3). When the number of locks held by the transaction exceeds a threshold, SQL Server attempts to change the intent lock on the page to a stronger lock; for example, an intent exclusive lock would change to an exclusive lock (4). After acquiring the stronger lock, all row-level locks held by the transaction on the page are released, reducing lock overhead (5).

SQL Server may choose to use both row and page locking for the same query, for example by placing page locks on the index (if enough contiguous keys in a non-clustered index node are selected to satisfy the query) and row locks on the data. This reduces the likelihood of lock escalation. SQL Server rarely needs to escalate locks; the query optimizer usually chooses the correct lock granularity at the time the execution plan is compiled. Lock escalation thresholds are determined dynamically by SQL Server and do not require configuration.

Several users running concurrent transactions can cause inconsistencies in the data read by other users. The following situations may occur:

Dirty read: Transaction T1 updates data X. Another transaction T2 then selects X before T1 performs a COMMIT. T1 then performs a ROLLBACK.
So T2 has read a value for X that never existed in the database as consistent (committed) data.

Non-repeatable read: Transaction T1 selects data X and Y. Another transaction T2 then updates X, deletes Y, and commits. T1 then selects X and Y again: it reads a modified value for X and discovers that Y no longer exists.

Phantom data: Transaction T1 selects all the data that satisfies the condition < 10; only X is returned. Transaction T2 then creates data Z and updates Y so that both satisfy the condition < 10, and commits. T1 then again selects all the data < 10; now X, Y, and Z are returned, so new data has appeared (phantom data).

The isolation level determines to what degree one transaction is isolated from other transactions. A lower isolation level increases concurrency but at the expense of data correctness; a higher isolation level ensures that data is correct, but can negatively affect concurrency. The isolation level set by an application determines the locking behavior used by SQL Server. To define the isolation level for a connection to the database, use the transact-SQL command SET TRANSACTION ISOLATION LEVEL. SQL-92 defines four isolation levels, all of which are supported by SQL Server:

Read uncommitted accepts all dirty reads, non-repeatable reads, and phantoms. No shared locks are issued and no exclusive locks are honored.

Read committed avoids dirty reads. To achieve this, shared locks are held while data is being read. But the data can be changed before the end of the transaction, resulting in non-repeatable reads or phantom data. This option is the SQL Server default.

Repeatable read avoids both dirty and non-repeatable reads. It sets locks on all data used in a query and prevents other users from updating that data. Phantom data can still occur.

Serializable avoids all dirty reads, non-repeatable reads, and phantoms. It sets a range lock on the selected data range; other users cannot update or insert rows within that range until the transaction is complete.
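As a summary of the four SQL-92 levels, the following Python sketch records which anomaly each level permits; the table merely restates the text above, and the helper name is invented:

```python
# Anomalies permitted at each SQL-92 isolation level, per the text.
ANOMALIES = {
    "read uncommitted": {"dirty read", "non-repeatable read", "phantom"},
    "read committed":   {"non-repeatable read", "phantom"},
    "repeatable read":  {"phantom"},
    "serializable":     set(),   # none: full isolation
}

def allows(level, anomaly):
    """True if the given isolation level can exhibit the anomaly."""
    return anomaly in ANOMALIES[level]

print(allows("read committed", "dirty read"))   # → False (avoided)
print(allows("repeatable read", "phantom"))     # → True  (still possible)
```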
The database access agent in the work processes handles database requests and consists of several subcomponents. One of these subcomponents is the database vendor-independent R/3 database interface (R/3 DB IF), which handles accesses to table and dictionary buffers. The database access agent also provides the Database SQL Library Interface (DBSL IF), which is database vendor-specific. All components on the SAP System side of this interface are independent of the database used; on the database side of the interface, only system components provided by the database vendor are used. With SQL Server, the components on the database side of the DBSL IF are implemented using a Microsoft product. The main task of the DBSL IF is the mapping of ABAP Open SQL statements to the database vendor-specific SQL language.
Each SAP work process is connected to the database server through several connections that are used for executing database commands. The DBSL IF is implemented using Microsoft OLE DB. If the SAP work processes have to connect to a computer other than the database server, the communication protocol is TCP/IP; otherwise Named Pipes is chosen. Open Data Services (ODS) is the component that manages all packets coming from the server net libraries, for example TCP/IP sockets. The database server processes all requests passed to it by ODS and returns the results to the client.
Several connections are used for each SAP work process, with the following isolation levels:

Connection 0: Committed read connection, used for consistent transactions involving inserts, deletes, updates, and select for update or database cursor usage.

Connection 1: Uncommitted read connection, used for all DDL transactions, for creating stored procedures, and to execute single selects and dirty read cursors.

Connections 2...N: Uncommitted read connections, used for dirty read selects. A maximum of 40 such connections are established for each work process, in addition to connections 0 and 1. These connections are opened as needed but only closed when the instance is shut down or the work process is restarted. If a select is nested in too many surrounding selects (more than 40), cursors are used in connection 1. Each connection to the database uses approximately 50 KB of memory.

To call the SAP Process Monitor, choose Tools → Administration → Monitor → Performance (ST04) → Activity. Choose the button Detail analysis menu, then the button SQL processes. By default, the output is sorted by CPU time and only connections used by the SAP System are shown. To display the SQL processes sorted by the host process id (pid), choose the button Group by/Raw Display or F8. The Appl. Server column displays the host name of the database client; for all connections established by the SAP System, this is the name of the SAP application server to which the SAP user is connected. The host pid shows the process id of the SAP work process. Each SAP work process opens a number of database connections, each of which is identified by its SQL Server process id (spid) and labeled by the application program name R3<T><nn>(<mm>)<type>:
<T>: Work process identifier (D: Dialog, B: Background, S: Spool, U: Update, E: Enqueue, 2: Update2, ' ': external tool such as saplicense or tp)
<nn>: Work process number
<mm>: Number of the connection
<type>: Connection context controlled by the DBIF
- comm rd: committed read
- sp create, single select: DDL, creation of stored procedures, and single selects
- unc rd: dirty read selects

You can also display the SQL processes by choosing Management → Current monitor → Process Info. The Process Info view displays all SQL processes sorted by their login id. A colored globe indicates an active process; all inactive processes are marked by gray globes. The information displayed is read from the table sysprocesses, which contains information about the client and system processes running on SQL Server. The table sysprocesses logically resides in the database master; it is built dynamically each time it is read and therefore does not exist physically.
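Assuming the label format R3<T><nn>(<mm>)<type> described above, a hypothetical parser for such connection names might look like this; the sample value and all names are invented for illustration:

```python
import re

# Pattern for R3<T><nn>(<mm>)<type>, e.g. a dialog work process 07
# using its second dirty-read connection.
NAME = re.compile(r"R3(?P<wp>.)(?P<nn>\d\d)\((?P<mm>\d+)\)(?P<type>.*)")

def parse_program_name(name):
    """Split an application program name into its four components:
    work process type, work process number, connection number,
    and connection context."""
    m = NAME.match(name)
    return (m.group("wp"), int(m.group("nn")),
            int(m.group("mm")), m.group("type").strip())

print(parse_program_name("R3D07(2)unc rd"))
# → ('D', 7, 2, 'unc rd')
```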
In ABAP, you access the database by using Open SQL commands. The ABAP program and its Open SQL commands are independent of the database system. An Open SQL command is converted into a standard form and is passed to the Database Access Agent. The Database Access Agent checks whether the accessed table is buffered in an SAP table buffer. If the table is buffered, the data is retrieved from the SAP buffers and results are supplied without accessing the database.
If the requested data is not found in the SAP buffers, the DBSL interface translates the Open SQL command into one or more stored procedures or into a direct SQL statement. Stored procedures are reusable collections of SQL commands compiled into a single execution plan. Permanent stored procedures are stored in the SAP database and are not deleted after SQL Server or the SAP instance is restarted. The DBSL interface creates a unique stored procedure name for each Open SQL command. Direct statements come from dynamic Open SQL statements, for example FOR ALL ENTRIES. Direct SQL statements are executed directly on the SAP database without first creating a stored procedure.
When an Open SQL statement is sent to the Database Access Agent, the operation is called the PREPARE operation. The following steps are performed: The statement is analyzed by the Database Access Agent and its origin is classified. The Open SQL statement is converted into a native SQL statement. For permanent stored procedures, a unique stored procedure name is created by the DBSL agent. A set of parameters is generated from the WHERE condition for the statement text. A stored procedure text is generated, consisting of the stored procedure name, the parameters, and the statement text. The stored procedure is created on the database. For direct SQL statements, the native SQL statement is executed on the database and cached in the procedure cache.
When a permanent stored procedure is executed, the following operations are performed: The DBSL agent passes the command to execute the stored procedure, including the parameters, to SQL Server. The SQL statements in the stored procedure are compiled and optimized, and an execution plan is created. If the stored procedure already exists and an execution plan is still in the procedure cache, the execution plan is reused. SQL Server executes the stored procedure.
Direct SQL statements are executed using the system stored procedure sp_execute in the OLE DB interface. This stored procedure executes a Transact-SQL statement that can be reused many times, or that has been built dynamically. The Transact-SQL statement can contain embedded parameters. The execution of the stored procedure or the direct SQL statement starts with the OPEN operation. The group of records returned when a stored procedure or a direct SQL statement is executed is called a result set. While the stored procedure or direct SQL statement is executed, the result set is transferred from SQL Server to the requesting SAP work process. This is called the FETCH operation.
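As an illustration of parameterized execution, a comparable effect can be seen with the system procedure sp_executesql, which compiles a parameterized Transact-SQL statement once and reuses its plan for subsequent executions (this is a hedged sketch; the document describes the OLE DB-level sp_execute call, and the table and parameter values here are examples):

```sql
-- Sketch: executing a parameterized Transact-SQL statement.
-- The statement text is compiled once; the cached plan can be
-- reused for later executions with different parameter values.
EXEC sp_executesql
    N'SELECT * FROM MARA WHERE MANDT = @p1 AND MATNR = @p2',
    N'@p1 nchar(3), @p2 nchar(18)',
    @p1 = N'100', @p2 = N'MAT-001';  -- example parameter values
```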
Before creating a permanent stored procedure, the DBSL interface checks whether the unique name already exists in a stored procedure name cache. If the stored procedure name is not found in the name cache, the DBSL interface checks whether it exists in the sysobjects table. If it is not found in the sysobjects table, it is created on the database before it is executed. If the stored procedure is found in the name cache, it is executed directly. The names of permanent stored procedures are stored in a cache whose size is defined by the SAP instance profile parameter dbs/mss/pn_cache_size. This value gives the number of permanent stored procedure names in the name cache. The default value is 10000 for kernel release 6.20; for later kernel releases the default value is 20000. Stored procedure names are stored physically in the sysobjects table; stored procedure texts are stored physically in the syscomments table in the SAP database.
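The physical storage of a procedure's name and text can be inspected directly on the database (a sketch against the SQL Server 2000 system tables named in the text; the procedure name is a hypothetical example of a DBSL-generated name):

```sql
-- Sketch: look up a stored procedure name in sysobjects
-- and its text in syscomments.
SELECT o.name, c.text
FROM sysobjects o
JOIN syscomments c ON c.id = o.id
WHERE o.type = 'P'          -- 'P' = stored procedure
  AND o.name = 'Y4711MARA'; -- example DBSL-generated name (hypothetical)
```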
From Release 4.5A, the SAP System exclusively uses trusted connections when running with SQL Server. With this method, the SQL Server login ID sapr3 is not used for SAP work process connections. Instead, the Windows user running the SAP service (SAPService<SID>) connects to the database server. Access to SQL Server is controlled by the Windows account or group, which is checked when logging on to the operating system on the application server. When SAP work processes connect to SQL Server, they request a Windows trusted connection. Windows does not open a trusted connection unless the SAP application server has successfully logged on using a valid Windows account, so SQL Server does not have to check the account itself. SQL Server takes the user account information from the trusted connection properties and matches it against the Windows accounts defined as valid SQL Server logins. If SQL Server finds a match, it accepts the connection.
Each database has two logical parts: data (data files) and transaction log (log files). When SQL Server and the SAP System are installed, the data files of the <SID> database are created as <drive>:\<SID>DATA1\<SID>DATA1.mdf and <drive>:\<SID>DATAn\<SID>DATAn.ndf, where n is the number of the file. The data files may reside on different physical drives. SAP recommends storing the data files on RAID 5. The standard installation creates 3 data files, which makes it easier to expand the database. See Chapter 6 'Regular Maintenance and Error Handling'. The transaction log files are created as <drive>:\<SID>LOG1\<SID>LOG1.ldf and <drive>:\<SID>LOGm\<SID>LOGm.ldf, where m is the number of the file. The log files must be mirrored; hardware mirroring using RAID 1 is strongly recommended. The standard installation creates one log file. After a standard installation, all SAP data files reside in the special filegroup PRIMARY. The stored procedure sp_helpfile returns the physical names and attributes of the files associated with the current database. Use this stored procedure to determine the names of the files attached to a database.
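A minimal usage sketch of sp_helpfile (the database name is a placeholder for the actual <SID>):

```sql
-- Sketch: list the files of the SAP database.
-- Replace SID with the actual system ID of the installation.
USE SID;
EXEC sp_helpfile;
-- Returns one row per file: logical name, file id, physical
-- file name, filegroup, size, maxsize, growth, and usage.
```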
To detect data cache problems, the cache hit ratio is the main indicator. An improper separation of logical database portions from other frequently accessed files may lead to disk hot spots, which result in high disk response times.
An Open SQL statement coming from an ABAP program is executed as a stored procedure or as a direct statement in SQL Server. For each table, there is a set of common statements such as INSERT, UPDATE and DELETE operations. A permanent stored procedure exists for each of these statements. These stored procedures can be reused. To shorten access times, the execution plan of a stored procedure is stored in the procedure cache. SQL Server automatically allocates the necessary space from the available memory, and dynamically adjusts the size of the procedure cache. For a detailed description of this stored procedure mechanism, see Unit 2: How the SAP System uses SQL Server.
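The contents of the procedure cache can be inspected through the syscacheobjects system table (a sketch for SQL Server 2000; column availability may differ in other releases):

```sql
-- Sketch: show cached execution plans and how often they were reused.
SELECT cacheobjtype, objtype, usecounts, sql
FROM master..syscacheobjects
ORDER BY usecounts DESC;
```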
The execution plan determines which indexes are to be used for table access (step 1). Once the execution plan of the stored procedure is in the procedure cache, the procedure can be executed. The required parts of the used indexes and tables are transported from disk into the data cache for further processing (step 2). It is helpful if the pages required are already in the data cache from previous use. The Query Processor selects the data required and returns it through the database interface to the SAP work process.
To display the most important performance parameters of the database, call transaction ST04, or choose Tools → Administration → Monitor → Performance → Database Activity. An analysis is only meaningful if the database has been running for several hours under a typical workload. To ensure a significant database workload, we recommend a minimum of 500 CPU busy seconds. Note: The default values displayed in the Server Engine section are relative values. To display the absolute values, choose Absolute values. Check the values in (1). The cache hit ratio (2), which is the main performance indicator for the data cache, shows the average percentage of requested data pages found in the cache. This is the average value since startup. The value should always be above 98, even during heavy workload. If it is significantly below 98, the data cache could be too small. To check the history of these values, use transaction ST04 and choose Detail analysis menu → Performance database. A snapshot is collected every 2 hours. The current size of the data cache (3) is displayed along with the target server memory, which is the total amount of dynamic memory the server can consume since server start (4). Memory setting (5) shows the memory allocation strategy used:
FIXED: SQL Server has a constant amount of memory allocated, set by the SQL Server configuration parameters min server memory (MB) = max server memory (MB).
RANGE: SQL Server dynamically allocates memory between min server memory (MB) and max server memory (MB).
AUTO: SQL Server dynamically allocates memory between 4 MB and 2 PB, set by min server memory (MB) = 0 and max server memory (MB) = 2147483647.
FIXED-AWE: SQL Server has a constant amount of memory allocated, set by min server memory (MB) = max server memory (MB); in addition, the Address Windowing Extensions functionality of Windows 2000 is enabled.
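The FIXED strategy above can be configured with sp_configure (a sketch; the value 4096 MB is an example, not a recommendation, and the actual values should follow SAP note 327494):

```sql
-- Sketch: fix the SQL Server memory allocation (FIXED strategy)
-- by setting min and max server memory to the same value.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'min server memory (MB)', 4096;  -- example value
EXEC sp_configure 'max server memory (MB)', 4096;  -- example value
RECONFIGURE;
```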
Managing SQL Server memory involves the Windows startup parameters /3GB and /PAE, and the SQL Server configuration parameters max server memory, min server memory, set working set size, and awe enabled. How memory should be configured for SQL Server further depends on: the coexistence of other SAP components on the same physical machine (for example, a Central Instance or an Update Instance running on the same machine as SQL Server), and the amount of available RAM and virtual memory. Recommended memory settings are detailed in SAP note 327494.
Before tuning memory or CPU for SQL Server, make sure that poor SQL statements have already been tuned, since they can significantly affect memory and CPU utilization. (How to detect and tune poor SQL statements is discussed later in this unit.) Before increasing SQL Server memory, you must check whether sufficient main memory is available. Operating system paging is a sign of insufficient main memory, especially on the database server. Call transaction ST06 and choose Detail analysis menu → Previous hours → Memory. The amount paged in per hour should not exceed 20% of the available physical memory. For optimal performance, the value should be 0. The SQL Server memory parameters should be set as described in SAP note 327494. Before increasing the number of CPUs on the server, check the CPU usage of other applications that could be moved to other servers (for example, SAP work processes). Also check the CPU utilization history of the Windows processes: use transaction ST06 and choose Detail analysis menu → Top CPU processes, or alternatively use the Windows Task Manager.
Disk I/O is the most time-consuming operation for the database system. Therefore, you can significantly improve performance by using fast disks and I/O controllers, and by physically separating files with high I/O activity so that they can operate concurrently. Maximum throughput can be achieved by placing the following four types of files on separate physical disks (or disk systems/controllers): the paging file(s), the SQL Server tempdb file(s), the <SID> transaction log file(s), and the <SID> data files.
RAID technology provides two basic functions: higher I/O performance, by striping data over multiple physical disks and thereby achieving an even I/O distribution over all disks; and fault tolerance, by mirroring disks or adding parity information. RAID 0 is called disk striping. All read/write operations are split into slices (usually 16-128 KB) that are spread across all disks in the RAID array. No redundancy is added. Read and write performance is improved by distributing the workload equally over all participating disks. This level provides the highest performance of all RAID levels. In RAID 0+1 (also called RAID 10), every disk in the stripe set is mirrored. It provides nearly the same performance as RAID 0 but adds redundancy at the cost of doubling the number of disks. RAID 5 combines disk striping with parity. One physical disk is added to hold the parity information, and the parity information is striped across all disks. Every write operation needs two physical reads and two physical writes (read old data + old parity, write new data + new parity). RAID 5 offers less fault tolerance, as only one disk in the whole array may fail without data loss. Also, not all of the total disk capacity is available for data: space equal to the size of the smallest disk in the array is used for parity information.
The paging files and the SQL Server tempdb files are temporary data files that do not require redundancy to ensure data security. For high availability, RAID 0+1 should be used. Since these files are write-intensive, they should not be placed on RAID 5 systems. If they are small enough to fit on one disk, RAID technology is not required. Transaction log files must be placed on hardware mirrored disks (RAID 1) to ensure data security. If more than one disk is used for transaction log files, additional striping should be used (RAID 0+1). Data files must be placed in a RAID 5 system to ensure data security (RAID 0+1 would be faster but more expensive).
The I/O statistics are included in the SAP SQL Server database monitor: in transaction ST04, choose Detail analysis menu → IO per File. You can choose between tempdb and the SAP database. The column ms/IO displays the average wait time for one I/O operation in milliseconds and is calculated from the output of the SQL Server system function fn_virtualfilestats. For acceptable performance, ms/IO should be below 10 ms for all data files (1). For the transaction log files, the average wait time is much smaller than for data files (2). fn_virtualfilestats counts the statistics since SQL Server start. For more information, see SQL Server Books Online and SAP note 521750.
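The ms/IO value shown by the monitor can be derived directly from fn_virtualfilestats (a sketch for SQL Server 2000, restricted to the current database; the column names are those documented for this function):

```sql
-- Sketch: average I/O wait time per file of the current database.
-- IoStallMS is the cumulative time users waited for I/O on the file.
SELECT FileId,
       NumberReads + NumberWrites AS IOs,
       IoStallMS,
       IoStallMS / NULLIF(NumberReads + NumberWrites, 0) AS avg_ms_per_IO
FROM ::fn_virtualfilestats(DB_ID(), -1);  -- -1 = all files of this database
```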
Disk I/O performance can be measured using the Performance Monitor (from Start, choose Settings → Control Panel → Administrative Tools → Performance Monitor). Note: Windows does not show the physical disks inside a RAID system. Therefore, you first need to divide the performance counters by the number of physical disks. Then, for RAID 1 (and RAID 0+1), multiply the number of I/O write requests by 2; for RAID 5, multiply the writes by 4 to get the number of physical disk I/O requests. When '% Disk Time' reaches 100, you must check I/O performance. The disk queue length (read + write queue) per physical disk should not significantly exceed 2; otherwise there is an I/O bottleneck. To collect disk performance data, enable the disk performance counters using the command diskperf -y (a Windows reboot is required afterwards). For long-term monitoring, use the Performance Monitor to create a log file with snapshot data at specified update intervals. For more information about using the Performance Monitor, see SAP note 110529 and the Windows Online Help.
Application problems can be classified into three groups: Lockwaits: these are caused by long-running transactions that hold locks, blocking other applications that want to acquire the same locks. Too many unnecessary statements: poor coding causes many small statements to be read in a loop, or results in data being read twice. Poorly qualified statements, where: insufficient selection criteria are given; the database optimizer has no efficient access path to the requested data (for example, in the case of missing indexes); or the database optimizer does not find the best access path (due to an inappropriate execution plan).
Exclusive lockwait situations due to database locks are shown in the SAP Database Monitor (call transaction ST04 and choose Detail analysis menu → Exclusive Lockwaits). The Host PID is the process ID of an SAP work process. The owner of the lock is shown in a blue line and holds the status 'granted', indicated by a green signal in the first column. All blocked processes are shown below with the status 'waiting', indicated by a red signal.
You can also view the current lockwait situation using the Enterprise Manager. On the database server, choose Management Current Activity Locks/Process ID. All the SQL process IDs are displayed. Blocking is written under the icons representing the SQL process IDs that have a blocking lock holder. Blocked By <n> is written under the icons that have a lock waiter that is waiting for a lock held by a SQL process ID. Blocked SQL processes are additionally marked with a red square in their icon. The head of a locking chain is marked with a red exclamation mark (spid 177 in the above example).
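The same blocking information can be retrieved on the database server with a query against the sysprocesses system table (a sketch for SQL Server 2000):

```sql
-- Sketch: find blocked sessions and the spid that is blocking them.
-- 'blocked' holds the spid of the lock holder (0 = not blocked).
SELECT spid, blocked AS blocking_spid,
       waittime, lastwaittype, hostname, program_name
FROM master..sysprocesses
WHERE blocked <> 0;
```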
In this example, the same SELECT statement may be executed several times. If the conditions in the WHERE clause do not change, the same data is read several times from the database. To find this kind of duplication, you can run an SQL Trace (Transaction ST05). This logs the communication between the SAP work processes and the database for a particular group of SAP users.
The runtime (duration) of a statement is highlighted in red if it exceeds a threshold value of 100,000 microseconds. If a statement returns few rows but has a long execution time, the statement must be improved. An inefficient statement can have several consequences: the database is kept busy processing many data blocks; CPU load is increased on the database server; an SAP work process is blocked by the report, causing high wait times for other processes; and several pages are displaced from the database buffer, so the cache hit rate for other SQL statements suffers. Performance can thus be harmed system-wide by expensive statements.
The database optimizer is a cost-based optimizer that minimizes the number of pages to be read. It estimates the cost of each usable index and compares these costs to the cost of a table scan for each table to be accessed. For this cost estimation, the optimizer uses statistical information about the data distribution, which is stored for each index on a statistics page. There are some common reasons for long execution times: no index exists, so the whole table is read sequentially in a full table scan; the index used is not very selective, so a large range has to be read in an index range scan; the optimizer reuses an execution plan that keeps an unsuitable strategy; or the statement is badly formulated and selects much more data than is actually necessary.
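The distribution statistics the optimizer relies on can be displayed for a given index with DBCC SHOW_STATISTICS (a sketch; the table and index names are examples from an SAP system):

```sql
-- Sketch: display the distribution statistics for an index.
-- MARA and its primary index MARA~0 are used as examples.
DBCC SHOW_STATISTICS ('MARA', 'MARA~0');
-- Returns the statistics header (last update, rows sampled),
-- density information, and the distribution histogram.
```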
To determine why a statement runs slowly, you can request more details from the SQL Trace. Place the cursor on the desired line and choose one of the following buttons:
Display Details, to show:
- The complete statement
- The name of the stored procedure used, if a permanent stored procedure was created
- Whether a database cursor was used for the execution
- The parameter values and types used
DDIC Information, to show:
- The table structure according to the SAP Data Dictionary (field definitions)
- The indexes defined on the table(s) used, as defined in the SAP Data Dictionary
Explain, to show:
- The optimizer decisions at the time of the explanation, that is, the indexes used and the JOIN sequence
Display Call Position in ABAP Program, to show:
- The related ABAP statement (if executed from ABAP)
The stored procedure name cache of each SAP instance provides additional space for storing statistical information about the execution of stored procedures, such as: the duration of the slowest and the average execution (in ms); the number of executions (since the statistics were turned on); the number of rows returned during the fastest, slowest, and average execution; the parameter values of the execution that returned the maximum number of rows; and the name of the stored procedure. To view the stored procedure statistics, call transaction ST04 and choose Detail analysis menu → SAP stats on SPs. By default, the stored procedure statistics are switched on (SAP instance profile parameter dbs/mss/stats_on = 1). The statistics can be displayed for a single application server or for all servers. You can check the contents of a specific stored procedure or display a list of selected stored procedures.
Two times are measured for every stored procedure execution: the duration without fetches and the duration with fetches, which may be significantly higher if a large number of rows is returned. The number of pages read or written is not displayed, as this information is not known at the SAP System level. To check which procedures are responsible for the most database load, sort by the column ms(+fetch). The top procedures are either slow or executed very often. To find the slow statements that are targets for optimization, sort by the column Max ms(+fetch). Statements used for changes, such as UPDATE, DELETE, INSERT, and SELECT FOR UPDATE, may have a high maximum time due to lockwait situations. Also, execution times may vary because of different parameter values. Therefore, you should sort by the column Avg. ms(+fetch); the top statements there are slow on every execution. You can select a line and choose SQL statement to get the procedure text (the statements executed within the stored procedure) and the parameter list of the call that returned the largest number of rows (if available). You can then use the Explain SQL function to display a new explain plan with actual parameters. You can also check the Explain SP of the precompiled procedure in the SQL Server procedure cache to see whether the stored execution plan is identical.
In this example, we assume that no suitable index exists and that the fields in the WHERE clause are MANDT and BISMT: SELECT * FROM MARA WHERE MANDT = xxx AND BISMT = yyy. To tune this expensive statement, you might consider creating a secondary index. However, creating a secondary index does not solve all problems; in fact, it may have consequences of its own. For example, if you increase the number of indexes on a table, the duration of INSERTs, DELETEs, and UPDATEs also increases. Not every field has to be contained in the index. If you use the Explain SQL function for this example, you can see that the clustered index on MANDT and MATNR was used, but it was not very efficient. This means that the MANDT field was not helpful and should not have been selected for the index. Since only one row was returned, the BISMT field is probably the selective one. Now you must determine how to create a useful index. A good index should be used in many statements. If an index is used often, the database request time and the database load are reduced. A selective field is a field that has: many different values in the real data in the table; and a small number of data rows with identical values, compared to the total number of rows in the table. If an index is small, many index rows fit on one index page. This is especially useful for index range scans that may be triggered, for example, by comparison operators. A covered index contains all selected fields plus all the fields in the WHERE clause. It prevents accesses to the data pages.
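A possible secondary index for this example can be sketched as follows (hedged: in an SAP system, indexes should be created through the ABAP Dictionary, transaction SE11, so that they are known to the SAP System; the index names here are hypothetical, following the SAP naming convention):

```sql
-- Sketch: secondary index on the selective field BISMT.
CREATE INDEX [MARA~Z01] ON MARA (BISMT);

-- A covered index for a narrower statement such as
--   SELECT MATNR FROM MARA WHERE MANDT = ? AND BISMT = ?
-- would include the selected field as well, so the data pages
-- never need to be touched:
-- CREATE INDEX [MARA~Z02] ON MARA (BISMT, MANDT, MATNR);
```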