In Computer Science

Published on February 2017 | Categories: Documents | Downloads: 97 | Comments: 0 | Views: 772

In computer science, transaction processing is information processing that is divided into individual, indivisible operations, called transactions. Each transaction must succeed or fail as a complete unit; it cannot remain in an intermediate state.

Contents

1 Description
2 Methodology
  2.1 Rollback
  2.2 Rollforward
  2.3 Deadlocks
  2.4 Compensating transaction
3 ACID criteria (Atomicity, Consistency, Isolation, Durability)
4 Implementations
5 See also
6 External references
7 Further reading

Description

Transaction processing is designed to maintain the integrity of a system (typically a database or some modern filesystems) by keeping it in a known, consistent state: any operations carried out on the system that are interdependent are either all completed successfully or all canceled successfully.

For example, consider a typical banking transaction that involves moving $700 from a customer's savings account to a customer's checking account. This transaction is a single operation in the eyes of the bank, but it involves at least two separate operations in computer terms: debiting the savings account by $700, and crediting the checking account by $700. If the debit operation succeeds but the credit does not (or vice versa), the books of the bank will not balance at the end of the day. There must therefore be a way to ensure that either both operations succeed or both fail, so that there is never any inconsistency in the bank's database as a whole. Transaction processing is designed to provide this.

Transaction processing allows multiple individual operations to be linked together automatically as a single, indivisible transaction. The transaction-processing system ensures that either all operations in a transaction are completed without error, or none of them are. If some of the operations are completed but errors occur when the others are attempted, the transaction-processing system "rolls back" all of the operations of the transaction (including the successful ones), thereby erasing all traces of the transaction and restoring the system to the consistent, known state that it was in before processing of the transaction began. If all operations of a transaction are completed successfully, the transaction is committed by the system, and all changes to the database are made permanent; the transaction cannot be rolled back once this is done.
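The all-or-nothing behaviour of the savings-to-checking transfer can be sketched with Python's standard `sqlite3` module. This is a minimal illustration, not how any particular bank system works: the `accounts` table, its `CHECK` constraint, and the helper names `open_bank`, `balances` and `transfer` are assumptions made for the example.

```python
import sqlite3

def open_bank():
    """Create an in-memory database holding the two example accounts (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)",
        [("savings", 1000), ("checking", 0)],
    )
    conn.commit()
    return conn

def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

def transfer(conn, source, target, amount):
    """Debit and credit as one transaction; return True if it committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
            credited = conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            if credited.rowcount != 1:
                # The debit above already ran; raising here undoes it as well.
                raise LookupError(f"no such account: {target}")
        return True
    except (LookupError, sqlite3.IntegrityError):
        return False

conn = open_bank()
print(transfer(conn, "savings", "checking", 700), balances(conn))
print(transfer(conn, "savings", "missing", 100), balances(conn))
```

In the second call the debit succeeds but the credit finds no account, so the whole transaction is rolled back and the savings balance is left as it was before the call.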

 

Transaction processing guards against hardware and software errors that might leave a transaction partially completed, with the system left in an unknown, inconsistent state. If the computer system crashes in the middle of a transaction, the transaction processing system guarantees that all operations in any uncommitted (i.e., not completely processed) transactions are cancelled.

Most of the time, transactions are issued concurrently. If they overlap (i.e. need to touch the same portion of the database), this can create conflicts. For example, if the customer mentioned in the example above has $1000 in his savings account and attempts to transfer $350 to a different person while at the same time moving $700 to the checking account, only one of them can succeed. However, forcing transactions to be processed sequentially (i.e. without overlapping in time) is inefficient. Therefore, under concurrency, transaction processing usually guarantees that the end result reflects a conflict-free outcome that can be reached as if executing the transactions sequentially in any order (a property called serializability). In our example, this means that no matter which transaction was issued first, either the transfer to a different person or the move to the checking account has succeeded, while the other one has failed.
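The $1000 example can be checked by enumerating the possible sequential orders. The sketch below is only a model of the idea: `run_serially` and `serial_outcomes` are hypothetical helper names, and a real system would reach one of these outcomes through locking or a similar mechanism rather than by enumeration.

```python
from itertools import permutations

def run_serially(balance, withdrawals):
    """Apply withdrawals one after another, rejecting any that would overdraw."""
    outcome = {}
    for label, amount in withdrawals:
        if amount <= balance:
            balance -= amount
            outcome[label] = "committed"
        else:
            outcome[label] = "rolled back"
    return balance, outcome

def serial_outcomes(balance, withdrawals):
    """Return the result of every possible sequential order of the withdrawals."""
    return [run_serially(balance, order) for order in permutations(withdrawals)]

requests = [("transfer to other person", 350), ("move to checking", 700)]
for final_balance, outcome in serial_outcomes(1000, requests):
    print(final_balance, outcome)
```

Each order commits exactly one of the two requests. A serializable execution of the concurrent transactions has to end in one of these two states, never in a state where both withdrawals were applied.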

Methodology

The basic principles of all transaction-processing systems are the same. However, the terminology may vary from one transaction-processing system to another, and the terms used below are not necessarily universal.

Rollback

Main article: Rollback (data management)

Transaction-processing systems ensure database integrity by recording intermediate states of the database as it is modified, then using these records to restore the database to a known state if a transaction cannot be committed. For example, copies of information on the database prior to its modification by a transaction are set aside by the system before the transaction can make any modifications (this is sometimes called a before image). If any part of the transaction fails before it is committed, these copies are used to restore the database to the state it was in before the transaction began.
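A before image can be sketched with a plain dictionary standing in for the database. The `Transaction` class below is a hypothetical illustration of the idea, not the mechanism of any specific product: it saves the old value of a key the first time the transaction touches it and puts those values back on rollback.

```python
import copy

class Transaction:
    """Keeps a before image of every key it modifies (illustrative only)."""

    def __init__(self, db):
        self.db = db
        self.before_images = {}  # key -> (existed_before, old_value)

    def write(self, key, value):
        # Set the old state aside before the first modification of this key.
        if key not in self.before_images:
            if key in self.db:
                self.before_images[key] = (True, copy.deepcopy(self.db[key]))
            else:
                self.before_images[key] = (False, None)
        self.db[key] = value

    def rollback(self):
        """Restore every modified key to its state before the transaction began."""
        for key, (existed, old_value) in self.before_images.items():
            if existed:
                self.db[key] = old_value
            else:
                self.db.pop(key, None)
        self.before_images.clear()

    def commit(self):
        """Make the changes permanent by discarding the before images."""
        self.before_images.clear()

db = {"savings": 1000, "checking": 0}
tx = Transaction(db)
tx.write("savings", 300)   # debit succeeds
tx.rollback()              # credit failed, so undo the debit
print(db)
```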

Rollforward

It is also possible to keep a separate journal of all modifications to a database (sometimes called after images). This is not required for rollback of failed transactions but it is useful for updating the database in the event of a database failure, so some transaction-processing systems provide it. If the database fails entirely, it must be restored from the most recent back-up. The back-up will not reflect transactions committed since the back-up was made. However, once the database is restored, the journal of after images can be applied to the database (rollforward) to bring the database up to date. Any transactions in progress at the time of the failure can then be rolled back. The result is a database in a consistent, known state that includes the results of all transactions committed up to the moment of failure.
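A rollforward can be sketched as replaying a journal over a restored back-up. The journal record layout below (`("write", txn, key, after_image)` and `("commit", txn)`) is an assumption made for the example. As a simplification, the sketch skips the after images of transactions that never committed instead of applying them and rolling them back afterwards.

```python
def roll_forward(backup, journal):
    """Rebuild the database from a back-up plus a journal of after images."""
    committed = {record[1] for record in journal if record[0] == "commit"}
    db = dict(backup)  # start from the restored back-up, leaving it untouched
    for record in journal:
        if record[0] == "write":
            _, txn, key, after_image = record
            if txn in committed:  # in-progress transactions are left out
                db[key] = after_image
    return db

backup = {"savings": 1000, "checking": 0}
journal = [
    ("write", "T1", "savings", 300),
    ("write", "T1", "checking", 700),
    ("commit", "T1"),
    ("write", "T2", "savings", 0),  # T2 was still in progress at the failure
]
print(roll_forward(backup, journal))
```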

 

Deadlocks

Main article: Deadlock

In some cases, two transactions may, in the course of their processing, attempt to access the same portion of a database at the same time, in a way that prevents them from proceeding. For example, transaction A may access portion X of the database, and transaction B may access portion Y of the database. If, at that point, transaction A then tries to access portion Y of the database while transaction B tries to access portion X, a deadlock occurs, and neither transaction can move forward. Transaction-processing systems are designed to detect these deadlocks when they occur. Typically both transactions will be cancelled and rolled back, and then they will be started again in a different order, automatically, so that the deadlock doesn't occur again. Or sometimes, just one of the deadlocked transactions will be cancelled, rolled back, and automatically re-started after a short delay.

Deadlocks can also occur between three or more transactions. The more transactions involved, the more difficult they are to detect, to the point that transaction processing systems find there is a practical limit to the deadlocks they can detect.
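One common way to detect such a situation is to look for a cycle in a "wait-for" graph, in which each transaction points at the transactions it is waiting on. The text above does not say which detection method a given system uses, so the sketch below is an assumed technique, and `find_deadlock` is a hypothetical helper name.

```python
def find_deadlock(waits_for):
    """Return the transactions forming a wait cycle, or None if there is none.

    `waits_for` maps each transaction to the transactions it is blocked by.
    """
    finished = set()  # transactions whose waits have already been explored

    def visit(txn, path):
        if txn in path:
            return path[path.index(txn):]  # the cycle closes here
        if txn in finished:
            return None
        finished.add(txn)
        for blocker in waits_for.get(txn, ()):
            cycle = visit(blocker, path + [txn])
            if cycle:
                return cycle
        return None

    for txn in waits_for:
        cycle = visit(txn, [])
        if cycle:
            return cycle
    return None

# A holds X and wants Y; B holds Y and wants X.
print(find_deadlock({"A": ["B"], "B": ["A"]}))
print(find_deadlock({"A": ["B"], "B": []}))
```

Once a cycle is found, a system can pick one member of the cycle as the victim to cancel and restart, which matches the second strategy described above.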

Compensating transaction

In systems where commit and rollback mechanisms are not available or undesirable, a compensating transaction is often used to undo failed transactions and restore the system to a previous state.
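A compensating transaction can be sketched as pairing every step with an explicit "undo" step and running the undo steps in reverse order when a later step fails. The helper `run_with_compensation` and the account functions below are hypothetical names chosen for the illustration.

```python
def run_with_compensation(steps):
    """Run (action, compensation) pairs; on failure, undo completed steps in reverse."""
    completed = []
    try:
        for action, compensation in steps:
            action()
            completed.append(compensation)
    except Exception:
        for compensation in reversed(completed):
            compensation()
        return False
    return True

accounts = {"savings": 1000, "checking": 0}

def debit_savings():
    accounts["savings"] -= 700

def refund_savings():
    # Compensation: a new operation that reverses the effect of the debit.
    accounts["savings"] += 700

def failing_credit():
    raise RuntimeError("checking ledger unavailable")

ok = run_with_compensation([
    (debit_savings, refund_savings),
    (failing_credit, lambda: None),
])
print(ok, accounts)
```

Unlike a rollback, the debit here really happened and was visible for a moment; the compensation is a second operation that cancels it out.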

ACID criteria (Atomicity, Consistency, Isolation, Durability)

Main article: ACID

Transaction processing has these benefits:

• It allows sharing of computer resources among many users
• It shifts the time of job processing to when the computing resources are less busy
• It avoids idling the computing resources without minute-by-minute human interaction and supervision
• It is used on expensive classes of computers to help amortize the cost by keeping high rates of utilization of those expensive resources

Implementations

Main article: Transaction processing system

Standard transaction-processing software, notably IBM's Information Management System, was first developed in the 1960s, and was often closely coupled to particular database management systems. Client–server computing implemented similar principles in the 1980s with mixed success. However, in more recent years, the distributed client–server model has become considerably more difficult to maintain. As the number of transactions grew in response to various online services (especially the Web), a single distributed database was not a practical solution. In addition, most online systems consist of a whole suite of programs operating together, as opposed to a strict client–server model where the single server could handle the transaction processing. Today a number of transaction processing systems are available that work at the inter-program level and which scale to large systems, including mainframes.

One well-known (and open) industry standard is the X/Open Distributed Transaction Processing (DTP) (see JTA). However, proprietary transaction-processing environments such as IBM's CICS are still very popular, although CICS has evolved to include open industry standards as well.

A modern transaction processing implementation combines elements of both object-oriented persistence with traditional transaction monitoring. One such implementation is the commercial DTS/S1 product from Obsidian Dynamics, or the open-source product db4o.

The term 'Extreme Transaction Processing' (XTP) has been used to describe transaction processing systems with uncommonly challenging requirements, particularly throughput requirements (transactions per second). Such systems may be implemented via distributed or cluster style architectures.

A transaction processing system is a type of information system. TPSs collect, store, modify, and retrieve the transactions of an organization. A transaction is an event that generates or modifies data that is eventually stored in an information system. It is recommended that a transaction processing system should pass the ACID test. The essence of a transaction program is that it manages data that must be left in a consistent state, e.g. if an electronic payment is made, the amount must be both withdrawn from one account and added to the other; it cannot complete only one of those steps. Either both must occur, or neither. In case of a failure preventing transaction completion, the partially executed transaction must be 'rolled back' by the TPS.

While this type of integrity must be provided also for batch transaction processing, it is particularly important for online processing: if e.g. an airline seat reservation system is accessed by multiple operators, after an empty seat inquiry, the seat reservation data must be locked until the reservation is made, otherwise another user may get the impression a seat is still free while it is actually being booked at the time. Without proper transaction monitoring, double bookings may occur. Other transaction monitor functions include deadlock detection and resolution (deadlocks may be inevitable in certain cases of cross-dependence on data), and transaction logging (in 'journals') for 'forward recovery' in case of massive failures.

Transaction processing is not limited to application programs. For example, journaling file systems also employ the notion of transactions.

Contents

1 Types
  1.1 Contrasted with batch processing
  1.2 Real-time and batch processing
2 Features
  2.1 Rapid response
  2.2 Reliability
  2.3 Inflexibility
  2.4 Controlled processing
3 Components
4 ACID test properties: first definition
  4.1 Atomicity
  4.2 Consistency
  4.3 Isolation
  4.4 Durability
  4.5 Concurrency
5 Storing and retrieving
  5.1 Databases and files
  5.2 Data warehouse
  5.3 Backup procedures
    5.3.1 Recovery process
    5.3.2 Types of back-up procedures
      5.3.2.1 Grandfather-father-son
      5.3.2.2 Partial backups
    5.3.3 Updating in a batch
    5.3.4 Updating in real-time
6 References
7 See also
8 Further reading





 

 

 



Types

Contrasted with batch processing

Batch processing is a form of transaction processing. Batch processing involves processing several transactions at the same time, and the results of each transaction are not immediately available when the transaction is being entered;[1] there is a time delay. Transactions are accumulated for a certain period (say, a day) and the updates are then made, typically after working hours. Online transaction processing is the form of transaction processing that processes data as it becomes available.

Real-time and batch processing

There are a number of differences between real-time and batch processing. These are outlined below:

• Each transaction in real-time processing is unique. It is not part of a group of transactions, even though those transactions are processed in the same manner. Transactions in real-time processing are stand-alone both in the entry to the system and also in the handling of output.
• Real-time processing requires the master file to be available more often for updating and reference than batch processing. The database is not accessible all of the time for batch processing.
• Real-time processing has fewer errors than batch processing, as transaction data is validated and entered immediately. With batch processing, the data is organised and stored before the master file is updated. Errors can occur during these steps.
• Infrequent errors may occur in real-time processing; however, they are often tolerated. It is not practical to shut down the system for infrequent errors.
• More computer operators are required in real-time processing, as the operations are not centralised. It is more difficult to maintain a real-time processing system than a batch processing system.

Features

Rapid response

Fast performance with a rapid response time is critical. Businesses cannot afford to have customers waiting for a TPS to respond; the turnaround time from the input of the transaction to the production of the output must be a few seconds or less.

Reliability

Many organizations rely heavily on their TPS; a breakdown will disrupt operations or even stop the business. For a TPS to be effective, its failure rate must be very low. If a TPS does fail, then quick and accurate recovery must be possible. This makes well-designed backup and recovery procedures essential.

Inflexibility

A TPS wants every transaction to be processed in the same way regardless of the user, the customer or the time of day. If a TPS were flexible, there would be too many opportunities for non-standard operations. For example, a commercial airline needs to consistently accept airline reservations from a range of travel agents; accepting different transaction data from different travel agents would be a problem.

Controlled processing

The processing in a TPS must support an organization's operations. For example, if an organization allocates roles and responsibilities to particular employees, then the TPS should enforce and maintain this requirement. An example of this is an ATM transaction.

Components

1. Input
2. Processing
3. Storage
4. Output

ACID test properties: first definition

Atomicity

Main article: Atomicity (database systems)

A transaction’s changes to the state are atomic: either all happen or none happen. These changes include database changes, messages, and actions on transducers. [2]

Consistency

A transaction is a correct transformation of the state. The actions taken as a group do not violate any of the integrity constraints associated with the state. This requires that the transaction be a correct program![2]

Isolation

Even though transactions execute concurrently, it appears to each transaction T that others executed either before T or after T, but not both.[2]

Durability

Once a transaction completes successfully (commits), its changes to the state survive failures.[2]

Concurrency

Ensures that two users cannot change the same data at the same time. That is, one user cannot change a piece of data before another user has finished with it. For example, if an airline ticket agent starts to reserve the last seat on a flight, then another agent cannot tell another passenger that a seat is available.
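The last-seat example can be sketched with a lock that covers both the availability check and the booking, so that no second agent can slip in between the two steps. The `Flight` class and `book_concurrently` helper are hypothetical names, and threads stand in for the ticket agents.

```python
import threading

class Flight:
    """A flight whose seat count is protected by a lock (illustrative only)."""

    def __init__(self, seats):
        self._free = seats
        self._lock = threading.Lock()
        self.booked_by = []

    def reserve(self, agent):
        """Check availability and book under one lock so the steps cannot interleave."""
        with self._lock:
            if self._free == 0:
                return False
            self._free -= 1
            self.booked_by.append(agent)
            return True

def book_concurrently(flight, agents):
    """Let every agent try to reserve a seat at the same time; return each result."""
    results = [None] * len(agents)

    def worker(index):
        results[index] = flight.reserve(agents[index])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(len(agents))]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results

flight = Flight(seats=1)
results = book_concurrently(flight, [f"agent-{i}" for i in range(8)])
print(results.count(True), "reservation succeeded for", flight.booked_by)
```

Which agent wins depends on thread scheduling, but exactly one reservation succeeds for the single remaining seat.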

Storing and retrieving

Storing and retrieving information from a TPS must be efficient and effective. The data are stored in warehouses or other databases, and the system must be well designed for its backup and recovery procedures.

Databases and files

The storage and retrieval of data must be accurate as it is used many times throughout the day. A database is an organized collection of data that stores the organization's accounting and operational records. Databases protect sensitive data, so they usually restrict the view of certain data. Databases are designed using hierarchical, network or relational structures; each structure is effective in its own sense.

• Hierarchical structure: organizes data in a series of levels, hence why it is called hierarchical. Its top-to-bottom structure consists of nodes and branches; each child node has branches and is only linked to one higher-level parent node.
• Network structure: similar to hierarchical, network structures also organize data using nodes and branches. But, unlike hierarchical, each child node can be linked to multiple higher-level parent nodes.
• Relational structure: unlike network and hierarchical, a relational database organizes its data in a series of related tables. This gives flexibility as relationships between the tables are built.

A relational structure.

A hierarchical structure.

A network structure.
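The difference between the three structures can be sketched with small in-memory data. The sample names (`Company`, `Sales`, `Project X`, `Alice`, `Bob`) and helper functions are invented for the illustration and do not come from any real system.

```python
# Hierarchical: every child is linked to exactly one parent.
hierarchical_parent = {"Sales": "Company", "Alice": "Sales", "Bob": "Sales"}

# Network: a child may be linked to several parents.
network_parents = {"Alice": ["Sales", "Project X"], "Bob": ["Sales"]}

# Relational: flat tables related through shared key values.
employees = [
    {"id": 1, "name": "Alice", "dept_id": 10},
    {"id": 2, "name": "Bob", "dept_id": 10},
]
departments = [{"dept_id": 10, "dept_name": "Sales"}]

def path_to_root(node, parent_of):
    """Follow the single parent link upwards in a hierarchical structure."""
    path = [node]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path

def employees_with_department(employees, departments):
    """Relate the two tables on dept_id, as a relational join would."""
    names = {d["dept_id"]: d["dept_name"] for d in departments}
    return [(e["name"], names[e["dept_id"]]) for e in employees]

print(path_to_root("Alice", hierarchical_parent))
print(network_parents["Alice"])
print(employees_with_department(employees, departments))
```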

The following features are included in real-time transaction processing systems:

• Good data placement: The database should be designed around the access patterns of many simultaneous users.
• Short transactions: Short transactions enable quick processing. Keeping them short limits contention between concurrent transactions and keeps the system responsive.
• Real-time backup: Backups should be scheduled during periods of low activity to prevent the server from lagging.
• High normalization: This reduces redundant information, which increases speed and improves concurrency; it also improves backups.
• Archiving of historical data: Uncommonly used data are moved into other databases or backed-up tables. This keeps tables small and also improves backup times.
• Good hardware configuration: Hardware must be able to handle many users and provide quick response times.

In a TPS, there are 5 different types of files. The TPS uses the files to store and organize its transaction data:

• Master file: Contains information about an organization's business situation. Most transactions and databases are stored in the master file.
• Transaction file: It is the collection of transaction records. It helps to update the master file and also serves as audit trails and transaction history.
• Report file: Contains data that has been formatted for presentation to a user.
• Work file: Temporary files in the system used during the processing.
• Program file: Contains the instructions for the processing of data.

Data warehouse

Main article: Data warehouse

A data warehouse is a database that collects information from different sources. Data gathered from real-time transactions can be analysed efficiently once it is stored in a data warehouse. It provides data that are consolidated, subject-oriented, historical and read-only:

• Consolidated: Data are organised with consistent naming conventions, measurements, attributes and semantics. This allows data in the warehouse, drawn from across the organization, to be used effectively in a consistent manner.
• Subject-oriented: Large amounts of data are stored across an organization; some of it could be irrelevant for reports and make querying the data difficult. A data warehouse organizes only key business information from operational sources so that it's available for analysis.
• Historical: Real-time TPS represent the current value at any time; an example could be stock levels. If past data are kept, querying the database could return a different response. A data warehouse stores a series of snapshots of an organisation's operational data generated over a period of time.
• Read-only: Once data are moved into a data warehouse, they become read-only, unless they were incorrect. Since the data represent a snapshot of a certain time, they must never be updated. The only operations which occur in a data warehouse are loading and querying data.

Backup procedures

 

A Dataflow Diagram of backup and recovery procedures.

Since business organizations have become very dependent on TPSs, a breakdown in their TPS may stop the business's regular routines and thus stop its operation for a certain amount of time. In order to prevent data loss and minimize disruptions when a TPS breaks down, a well-designed backup and recovery procedure is put into use. The recovery process can rebuild the system when it goes down.

Recovery process

A TPS may fail for many reasons. These reasons could include a system failure, human errors, hardware failure, incorrect or invalid data, computer viruses, software application errors, or natural or man-made disasters. As it's not possible to prevent all TPS failures, a TPS must be able to cope with failures. The TPS must be able to detect and correct errors when they occur. A TPS will go through a recovery of the database to cope when the system fails; it involves the backup, journal, checkpoint, and recovery manager:



• Journal: A journal maintains an audit trail of transactions and database changes. Transaction logs and database change logs are used: a transaction log records all the essential data for each transaction, including data values, time of transaction and terminal number. A database change log contains before and after copies of records that have been modified by transactions.
• Checkpoint: The purpose of checkpointing is to provide a snapshot of the data within the database. A checkpoint, in general, is any identifier or other reference that identifies the state of the database at a point in time. Modifications to database pages are performed in memory and are not necessarily written to disk after every update. Therefore, periodically, the database system must perform a checkpoint to write these updates, which are held in memory, to the storage disk. Writing these updates to the storage disk creates a point in time from which the database system can apply changes contained in a transaction log during recovery after an unexpected shutdown or crash of the database system.

If a checkpoint is interrupted and a recovery is required, then the database system must start recovery from a previous successful checkpoint. Checkpointing can be either transaction-consistent or non-transaction-consistent (also called fuzzy checkpointing). Transaction-consistent checkpointing produces a persistent database image that is sufficient to recover the database to the state that was externally perceived at the moment of starting the checkpointing. Non-transaction-consistent checkpointing results in a persistent database image that is insufficient to perform a recovery of the database state. To perform the database recovery, additional information is needed, typically contained in transaction logs.

Transaction-consistent checkpointing refers to a consistent database, which doesn't necessarily include all the latest committed transactions, but all modifications made by transactions that were committed at the time checkpoint creation was started are fully present. Non-transaction-consistent checkpointing refers to a checkpoint which is not necessarily a consistent database, and which can't be recovered to one without all log records generated for open transactions included in the checkpoint. Depending on the type of database management system implemented, a checkpoint may incorporate indexes, storage pages (user data), or both. If no indexes are incorporated into the checkpoint, indexes must be created when the database is restored from the checkpoint image.

• Recovery Manager: A recovery manager is a program which restores the database to a correct condition from which transaction processing can be restarted.
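How a checkpoint and a transaction log work together during recovery can be sketched as follows. The `TinyDB` class is a hypothetical toy, not a real database engine: it assumes the log survives a crash while the in-memory pages do not, and that every logged write belongs to a committed transaction.

```python
class TinyDB:
    """Toy store with an in-memory state, a durable log and a checkpoint."""

    def __init__(self):
        self.pages = {}            # in-memory state, lost on a crash
        self.log = []              # transaction log, assumed to survive a crash
        self.checkpoint = ({}, 0)  # (persistent image, log position it covers)

    def write(self, key, value):
        self.log.append((key, value))  # log the change before applying it
        self.pages[key] = value

    def take_checkpoint(self):
        """Persist the current pages and remember how much of the log they cover."""
        self.checkpoint = (dict(self.pages), len(self.log))

    def crash(self):
        self.pages = {}  # everything held only in memory is gone

    def recover(self):
        """Start from the last checkpoint image and re-apply the later log records."""
        image, position = self.checkpoint
        self.pages = dict(image)
        for key, value in self.log[position:]:
            self.pages[key] = value

db = TinyDB()
db.write("a", 1)
db.take_checkpoint()
db.write("b", 2)
db.write("a", 3)
db.crash()
db.recover()
print(db.pages)
```

Only the two log records written after the checkpoint have to be replayed, which is the reason for taking checkpoints periodically.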

Depending on how the system failed, there can be two different recovery procedures used. Generally, the procedure involves restoring data that has been collected from a backup device and then running the transaction processing again. Two types of recovery are backward recovery and forward recovery:

• Backward recovery: used to undo unwanted changes to the database. It reverses the changes made by transactions which have been aborted. It involves the logic of reprocessing each transaction, which is very time-consuming.
• Forward recovery: it starts with a backup copy of the database. The transactions are then reprocessed according to the transaction journal covering the period between the time the backup was made and the present time. It's much faster and more accurate.

See also: Checkpoint restart

Types of back-up procedures

There are two main types of back-up procedures: grandfather-father-son and partial backups.

Grandfather-father-son

This procedure refers to at least three generations of backup master files. Thus, the most recent backup is the son and the oldest backup is the grandfather. It's commonly used for a batch transaction processing system with a magnetic tape. If the system fails during a batch run, the master file is recreated by using the son backup and then restarting the batch. However, if the son backup fails, is corrupted or destroyed, then the next generation up backup (father) is required. Likewise, if that fails, then the next generation up backup (grandfather) is required. Of course, the older the generation, the more the data may be out of date. Organizations can have up to twenty generations of backup.
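The generation fallback can be sketched with a bounded queue that keeps the newest backup first. The `GenerationalBackups` class and its `is_usable` check are hypothetical; a real procedure would verify tapes rather than call a Python function.

```python
from collections import deque

class GenerationalBackups:
    """Keeps the most recent backups, newest (son) first."""

    def __init__(self, generations=3):
        # When full, adding a new son drops the oldest generation.
        self._backups = deque(maxlen=generations)

    def take(self, master_file):
        self._backups.appendleft(dict(master_file))

    def restore(self, is_usable=lambda backup: True):
        """Return (generation, copy) of the newest usable backup; 0 is the son."""
        for generation, backup in enumerate(self._backups):
            if is_usable(backup):
                return generation, dict(backup)
        raise RuntimeError("no usable backup generation")

backups = GenerationalBackups()
for version in (1, 2, 3, 4):
    backups.take({"version": version})

print(backups.restore())
# Pretend the son is corrupted, so the father has to be used instead.
print(backups.restore(is_usable=lambda backup: backup["version"] != 4))
```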

 

Partial backups

This only occurs when parts of the master file are backed up. The master file is usually backed up to magnetic tape at regular times; this could be daily, weekly or monthly. Completed transactions since the last backup are stored separately and are called journals, or journal files. The master file can be recreated from the journal files on the backup tape if the system fails.

Updating in a batch

This is used when transactions are recorded on paper (such as bills and invoices) or when they are stored on a magnetic tape. Transactions will be collected and updated as a batch when it's convenient or economical to process them. Historically, this was the most common method, as the information technology did not exist to allow real-time processing.

The two stages in batch processing are:

• Collecting and storage of the transaction data into a transaction file: this involves sorting the data into sequential order.
• Processing the data by updating the master file: this can be difficult, as it may involve data additions, updates and deletions that have to happen in a certain order. If an error occurs, then the entire batch fails.
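The two stages can be sketched as sorting a transaction file and applying it to a working copy of the master file, so that any error leaves the real master file untouched. The record layout (`seq`, `kind`, `key`, `value`) and the function name `run_batch` are assumptions made for the example.

```python
def run_batch(master, transaction_file):
    """Apply a sorted batch to a copy of the master file; any error fails the batch."""
    working = dict(master)  # the master file itself is never modified here
    for txn in sorted(transaction_file, key=lambda t: t["seq"]):
        kind, key = txn["kind"], txn["key"]
        if kind == "add":
            if key in working:
                raise ValueError(f"record already exists: {key}")
            working[key] = txn["value"]
        elif kind == "update":
            if key not in working:
                raise KeyError(key)
            working[key] = txn["value"]
        elif kind == "delete":
            del working[key]  # raises KeyError if the record is missing
        else:
            raise ValueError(f"unknown transaction kind: {kind}")
    return working

master = {"A": 1}
transactions = [
    {"seq": 2, "kind": "update", "key": "B", "value": 5},
    {"seq": 1, "kind": "add", "key": "B", "value": 0},
]
master = run_batch(master, transactions)  # replaced only if the whole batch succeeded
print(master)
```

Here the update of `B` only works because sorting places the addition of `B` before it, which illustrates why the order of additions, updates and deletions matters.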

Updating in batch requires sequential access: since it uses a magnetic tape, this is the only way to access data. A batch will start at the beginning of the tape, then read it in the order it was stored; it's very time-consuming to locate specific transactions.

The information technology used includes a secondary storage medium which can store large quantities of data inexpensively (thus the common choice of a magnetic tape). The software used to collect data does not have to be online; it doesn't even need a user interface.

Updating in real-time

This is the immediate processing of data. It provides instant confirmation of a transaction. This involves a large number of users who are simultaneously performing transactions to change data. Because of advances in technology (such as the increase in the speed of data transmission and larger bandwidth), real-time updating is possible.

Steps in a real-time update involve the sending of transaction data to an online database in a master file. The person providing information is usually able to help with error correction and receives confirmation of the transaction completion.

Updating in real-time uses direct access of data. This occurs when data are accessed without accessing previous data items. The storage device stores data in a particular location based on a mathematical procedure. The same procedure is later calculated again to find the approximate location of the data. If the data are not found at this location, the device searches through successive locations until they are found.
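The "mathematical procedure" described above behaves like hashing with a search through successive locations (often called linear probing). The sketch below is one assumed reading of that description: `DirectAccessFile`, its byte-sum hash and the fixed slot count are all invented for the illustration, and deleting records is left out.

```python
class DirectAccessFile:
    """Fixed number of slots addressed by a hash of the record key."""

    def __init__(self, slots=8):
        self._slots = [None] * slots

    def _home(self, key):
        # The "mathematical procedure": a simple, deterministic hash of the key.
        return sum(key.encode()) % len(self._slots)

    def _probe(self, key):
        """Yield the home location, then each successive location, wrapping around."""
        start = self._home(key)
        for offset in range(len(self._slots)):
            yield (start + offset) % len(self._slots)

    def store(self, key, record):
        for index in self._probe(key):
            if self._slots[index] is None or self._slots[index][0] == key:
                self._slots[index] = (key, record)
                return index
        raise RuntimeError("file is full")

    def fetch(self, key):
        for index in self._probe(key):
            if self._slots[index] is None:
                return None  # an empty slot ends the search
            if self._slots[index][0] == key:
                return self._slots[index][1]
        return None

f = DirectAccessFile()
print(f.store("ab", "first record"))   # stored at its home location
print(f.store("ba", "second record"))  # same home location, so the next slot is used
print(f.fetch("ba"))
```

The keys `"ab"` and `"ba"` hash to the same location, so the second record lands in the following slot and is found again by searching successive locations.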

 

The information technology used could be a secondary storage medium that can store large amounts of data and provide quick access (thus the common choice of a magnetic disk).

