Oracle

Published on January 2017

Kranthi

ORACLE
Own Preparation

What is a Database?
A database is a structured collection of records or data stored in a computer system.

Types of Databases
There are several common types of databases, and each type has its own data model (how the data is structured). They include the flat model, the hierarchical model, the network model, and the relational model.

The Flat Model Database
In a flat model database, there is a two-dimensional (flat) array of data. For instance, there is one column of information, and within this column each data item is assumed to be related to the others. A flat model database might contain only zip codes: within the database there is a single column, and each new row in that column is a new zip code.

The Hierarchical Model Database
The hierarchical model database resembles a tree-like structure, much as Microsoft Windows organizes folders and files. In a hierarchical model database, each upward link is nested in order to keep data organized in a particular order on a same-level list. For instance, a hierarchical database of sales might list each day's sales as a separate file; within this nested file are all of the sales (the same types of data) for the day.

The Network Model
In a network model, the defining feature is that a record is stored with links to other records; in effect, the records are networked. These links (sometimes referred to as pointers) can be a variety of different types of information, such as node numbers or even a disk address.

The Relational Model
The relational model is the most popular type of database and an extremely powerful tool, not only for storing information but for accessing it as well. Relational databases are organized as tables. The beauty of a table is that the information can be accessed or added without reorganizing the tables. A table can have many records, and each record can have many fields. A table is sometimes called a relation.
For instance, a company can have a database called customer orders; within this database will be several different tables, or relations, all relating to customer orders. Tables can include customer information (name, address, contact info, customer number, etc.) and other tables (relations) such as the orders the customer previously placed (item number, item description, payment amount, payment method, etc.). It should be noted that every record (group of fields) in a relational database has its own primary key. A primary key is a unique field that makes it easy to identify a record. Relational databases use a query language called SQL, or Structured Query Language. SQL is currently used on practically all relational databases. Relational databases are extremely easy to customize to fit almost any kind of data storage. You can easily create relations for items that you sell, employees that work for your company, and so on.
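The customer-orders example above can be sketched with SQLite via Python's standard library; the table and column names here are assumptions for illustration, not a real schema:

```python
import sqlite3

# In-memory database holding two relations linked by a primary/foreign key.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("""CREATE TABLE customers (
    customer_number INTEGER PRIMARY KEY,
    name TEXT,
    address TEXT)""")
cur.execute("""CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_number INTEGER REFERENCES customers(customer_number),
    item_description TEXT,
    payment_amount REAL)""")

cur.execute("INSERT INTO customers VALUES (1, 'Ada Smith', '12 High St')")
cur.execute("INSERT INTO orders VALUES (101, 1, 'Keyboard', 45.00)")
cur.execute("INSERT INTO orders VALUES (102, 1, 'Monitor', 180.00)")

# A query joining the two relations through the customer_number key.
cur.execute("""SELECT c.name, o.item_description
               FROM customers c JOIN orders o
                 ON c.customer_number = o.customer_number
               WHERE o.payment_amount > 100""")
print(cur.fetchall())  # [('Ada Smith', 'Monitor')]
```

Note how rows can be added to either table without reorganizing the other; the relationship lives entirely in the shared customer_number key.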

Accessing Information Using a Database
While storing data is a great feature of databases, for many database users the most important feature is quick and simple retrieval of information. In a relational database, it is extremely easy to pull up information about an employee, and relational databases also add the power of running queries. Queries are requests for specific types of information, either shown in their natural state or used to create a report. For instance, if you had a database of employees that included tables such as salary and job description, you could easily run a query for the jobs that pay over a certain amount. No matter what kind of information you store in your database, queries can be created using SQL to help answer important questions.

Storing a Database
Databases can be very small (less than 1 MB) or extremely large and complicated (terabytes, as in many government databases); however, all databases are usually stored on hard disks or other storage devices and are accessed via computer. Large databases may require separate servers and locations, but many small databases can fit easily as files on your computer's hard drive.

Securing a Database
Obviously, many databases store confidential and important information that should not be easily accessed by just anyone. Many databases require passwords and other security features in order to access the information. While some databases can be accessed over the internet through a network, other databases are closed systems and can only be accessed on site.

What is a Relational Database?
A relational database stores data in separate tables instead of placing all data in one large table. It then allows database administrators (DBAs) to define relationships between these tables. These relationships enable DBAs to combine data from several tables for querying and reporting.
This is accomplished through the use of keys, which are database fields used to uniquely identify specific records in a table. Relational database technology allows databases to be larger, faster, and more efficient. The concept of a relational database was first developed by Dr. Edgar F. (Ted) Codd in "A Relational Model of Data for Large Shared Data Banks" (1970). Dr. Codd defined thirteen rules (a foundational rule plus twelve numbered rules) which must be met before a database can be considered relational. The foundational rule: a relational DBMS must be able to manage databases entirely through its relational capabilities.

1. Information rule -- All information in a relational database (including table and column names) is represented explicitly as values in tables.

2. Guaranteed access -- Every value in a relational database is guaranteed to be accessible by using a combination of the table name, primary key value, and column name.
3. Systematic null value support -- The DBMS provides systematic support for the treatment of null values (unknown or inapplicable data), distinct from default values, and independent of any domain.
4. Active, online relational catalog -- The description of the database and its contents is represented at the logical level as tables and can therefore be queried using the database language.
5. Comprehensive data sublanguage -- At least one supported language must have a well-defined syntax and be comprehensive. It must support data definition, manipulation, integrity rules, authorization, and transactions.
6. View updating rule -- All views that are theoretically updatable can be updated through the system.
7. Set-level insertion, update, and deletion -- The DBMS supports not only set-level retrievals but also set-level inserts, updates, and deletes.
8. Physical data independence -- Application programs and ad hoc programs are logically unaffected when physical access methods or storage structures are altered.
9. Logical data independence -- Application programs and ad hoc programs are logically unaffected, to the extent possible, when changes are made to the table structures.
10. Integrity independence -- The database language must be capable of defining integrity rules. They must be stored in the online catalog, and they cannot be bypassed.
11. Distribution independence -- Application programs and ad hoc requests are logically unaffected when data is first distributed or when it is redistributed.
12. Nonsubversion -- It must not be possible to bypass the integrity rules defined through the database language by using lower-level languages.

Alternatives to the relational database model include the hierarchical model, the network model, and the object model.

What is a Flat File?
A computer, in its most simplified definition, is nothing more than a device that stores, processes, communicates, and manipulates data. Data is at the heart of every computer program, every web site, and even every video game. The computer has revolutionized modern life by enabling a level of speed and accuracy in handling information that has never before been possible. Computers accomplish this quite simply by following a set of instructions called code.

A flat file is a static document, spreadsheet, or textual record that typically contains data that is not structurally related. Flat files are so called because little can be done with the information they contain other than reading, storing, and sending it. Flat files are typically basic data sets used for configuration data storage by applications and programs, and the average computer user will not typically see them much. There are basic address book-like examples of flat files in use, but modern users typically want a more robust and sophisticated way of calling up their information than the limited capability of a flat file structure allows. Flat files are commonly used by and found in database management systems, typically organized into flat file databases. A flat file database is really nothing more than an organization criterion applied to a set of flat files. Programmers probably use flat files and flat file databases much more often than the average computer user, when constructing applications in database management systems such as MySQL, a popular DBMS that supports multiple programming languages. Flat files are also commonly used by website developers within languages such as PHP or ASP. It has been noted that flat files, since they are simple data files, take up much less space than structured files. However, the utility reading a flat file must be rather sophisticated, because it needs to know what to do with the file once it has been accessed.

What is SQL?
SQL (Structured Query Language) is the most common standardized language used to access databases. SQL was originally developed by IBM in the 1970s for its System R research project and later used in its DB2 RDBMS. SQL version 3 is officially defined by the American National Standards Institute (ANSI) in the ANSI SQL:1999 standard. Most existing DBMSs currently conform to the earlier ANSI SQL-92 standard. SQL is a nonprocedural language.
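The flat-file structure described above can be sketched in a few lines; the file contents and field layout here are purely illustrative. All of the structure lives in the reading program, not in the file itself:

```python
from io import StringIO

# A hypothetical flat file of contacts: one record per line, fields separated
# by commas. StringIO stands in for an actual file on disk.
flat_file = StringIO("Ada Smith,555-0101\nBob Jones,555-0102\n")

# Reading is all the flat file supports: the program must know the layout.
records = [line.strip().split(",") for line in flat_file]
names = [name for name, phone in records]
print(names)  # ['Ada Smith', 'Bob Jones']
```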
Oracle produces a procedural extension of SQL called PL/SQL. SQL is supported by every major database system in use today, including MySQL, PostgreSQL, Berkeley DB, Oracle, DB2, Sybase, Informix, Microsoft SQL Server, and Firebird. SQL is often pronounced "sequel".

What is a Transactional Database?
A transactional database is a DBMS in which write operations can be rolled back if they are not completed properly. If a transactional database system loses electrical power halfway through a transaction, the partially completed transaction is rolled back and the database is restored to the state it was in before the transaction started. Imagine that a front-end application is sending a customer order to a database system. The front-end application sends the request to ship the product to the customer and subtract it from inventory. The front-end application is about to send the request to create an invoice for the customer when it suddenly crashes. A transactional database can then roll back the partially completed transaction. An alternative to using a transactional database is to use atomic operations.

What is a Stored Procedure?
A stored procedure is a set of SQL commands that has been compiled and stored on the database server. Once the stored procedure has been "stored", client applications can execute it over and over again without sending it to the database server again and without compiling it again. Stored procedures improve performance by reducing network traffic and CPU load.

What is a Database Trigger?
A database trigger is a stored procedure that is invoked automatically when a predefined event occurs. Database triggers enable DBAs (database administrators) to create additional relationships between separate databases. For example, the modification of a record in one database could trigger the modification of a record in a second database.

What is Database Replication?
Database replication is the creation and maintenance of multiple copies of the same database.
In most implementations of database replication, one database server maintains the master copy of the database and additional database servers maintain slave copies of the database.

Database writes are sent to the master database server and are then replicated by the slave database servers. Database reads are divided among all of the database servers, which results in a large performance advantage due to load sharing. In addition, database replication can also improve availability because the slave database servers can be configured to take over the master role if the master database server becomes unavailable.
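The rollback and trigger behavior described in the transactional database and trigger sections above can be sketched with SQLite (via Python's standard library); the table names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (item TEXT PRIMARY KEY, qty INTEGER)")
conn.execute("CREATE TABLE audit (item TEXT, change TEXT)")
conn.execute("INSERT INTO inventory VALUES ('widget', 10)")

# Trigger: fires automatically whenever an inventory row is updated.
conn.execute("""CREATE TRIGGER log_update AFTER UPDATE ON inventory
                BEGIN
                    INSERT INTO audit VALUES (NEW.item, OLD.qty || ' -> ' || NEW.qty);
                END""")
conn.commit()

# Transaction: the UPDATE (and the trigger's audit row) are rolled back
# when a later statement in the same transaction fails.
try:
    with conn:  # commits on success, rolls back on exception
        conn.execute("UPDATE inventory SET qty = 9 WHERE item = 'widget'")
        conn.execute("INSERT INTO inventory VALUES ('widget', 5)")  # violates PRIMARY KEY
except sqlite3.IntegrityError:
    pass

print(conn.execute("SELECT qty FROM inventory").fetchone())  # (10,) -- update rolled back
```

The audit table is also empty afterwards: the trigger's insert was part of the failed transaction, so it was rolled back with everything else.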

What is a Data Dictionary?
A data dictionary is a reserved space within a database which is used to store information about the database itself. A data dictionary may contain information such as:
* Database design information
* Stored SQL procedures
* User permissions
* User statistics
* Database process information
* Database growth statistics
* Database performance statistics

What is Database Normalization?
Database normalization is the process of organizing data into distinct and unique sets. The purposes of normalization are to:
* Reduce or eliminate storage of duplicate data
* Organize data into an efficient and logical structure
The process of normalization involves determining what data should be stored in each database table. By tradition, the process works through well-defined steps, called normal forms. In First Normal Form (1NF) you eliminate duplicate columns from the same table, create separate tables for each group of related data, and identify each row with a unique column or set of columns (the primary key). In Second Normal Form (2NF) you remove subsets of data that apply to multiple rows of a table, place them in separate tables, and create relationships between these new tables and the original tables through the use of foreign keys. In Third Normal Form (3NF) you remove columns that are not dependent upon the primary key. Additional normal forms have been defined but are less commonly used; these include Fourth Normal Form (4NF), Fifth Normal Form (5NF), Boyce-Codd Normal Form (BCNF), and Domain-Key Normal Form (DK/NF).

What is a BLOB?

A BLOB (Binary Large OBject) is a large chunk of data stored in a database. A BLOB differs from regular database data in that it is not forced into a certain structure. A normal database field might be structured to be 14 characters long and accept only lowercase letters; a BLOB field is not usually restricted in content type, and its content can be several gigabytes in size. Normal database fields have space allocated for them whether they are used or not; BLOB fields are only allocated space when they are used. BLOB fields are normally used to store graphics, audio, video, or documents. BLOB fields can be added, changed, and deleted; however, they cannot be searched or manipulated with standard database commands.

What is Free Database Software?

MySQL
The MySQL database server is the world's most popular open-source database server. Over six million installations use the MySQL database server to power high-volume web sites and other mission-critical business systems, including industry leaders like NASA, Yahoo, The Associated Press (AP), Suzuki, and Sabre Holdings. MySQL is an attractive alternative to higher-cost, more complex database technology. Its award-winning reliability, scalability, and speed make it the right choice for a wide range of corporate IT departments, web developers, and software vendors. MySQL offers several key advantages:
* Reliability and Performance. MySQL AB makes early versions of all its database server software available to the open source community to allow for several months of "battle testing" before it deems them ready for production use.
* Ease of Use and Deployment. The MySQL architecture makes it extremely fast and simple to customize. The unique multi-storage-engine architecture of MySQL gives corporate customers the flexibility they need with a DBMS unmatched in stability, speed, compactness, and ease of deployment.
* Freedom from Platform Lock-in.
By providing ready access to the source code, MySQL AB's approach ensures freedom, preventing lock-in to a single vendor or platform.
* Cross-Platform Support. MySQL is available on more than twenty platforms, including all major Linux distributions, Unix, Microsoft Windows, and Mac OS X.
* Millions of Trained and Certified Developers. MySQL is the world's most popular open source database, which makes it easy to find knowledgeable and experienced DBAs and developers.

PostgreSQL
PostgreSQL is an extremely scalable, SQL-compliant, open-source object-relational DBMS. With more than 15 years of development history, PostgreSQL is quickly becoming the de facto database for enterprise-level open-source solutions. PostgreSQL is an object-relational database management system (ORDBMS) based on POSTGRES, Version 4.2, developed at the University of California at Berkeley Computer Science Department. POSTGRES pioneered many concepts that only became available in commercial database systems much later. PostgreSQL is an open-source descendant of this original Berkeley code. It supports SQL92 and SQL99 and offers many modern features:
* Complex queries
* Foreign keys
* Triggers
* Views
* Transactional integrity
* Multiversion concurrency control

Berkeley DB
Berkeley DB is one of the most widely used developer databases in the world. It is open source and runs on all major operating systems, including embedded Linux, Linux, Unix, Microsoft Windows, Mac OS X, VxWorks, and QNX. Berkeley DB provides the core data management functionality, scalability, power, and flexibility of enterprise relational databases, but without the overhead of a query processing layer. Combined with the stability and lower support cost of open source code, Berkeley DB provides many advantages, including:
* Zero administration cost, eliminating the need for a DBA
* Smaller footprint (less than 500 KB)
* Simplicity of integration into an application
* More speed and higher performance
* Less complexity and more reliability
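The embedded key-value model Berkeley DB uses (direct data access without a query-processing layer) can be sketched with Python's standard-library dbm module, which provides a similar file-based key-value store; the keys and values here are purely illustrative:

```python
import dbm
import os
import tempfile

# An embedded key-value store in the Berkeley DB style: no server, no SQL
# layer, just byte-string keys mapped to byte-string values in a local file.
path = os.path.join(tempfile.mkdtemp(), "store")

with dbm.open(path, "c") as db:   # "c": create the file if it does not exist
    db[b"user:1"] = b"Ada"
    db[b"user:2"] = b"Bob"

# Reopen read-only: the data persisted to disk without any administration.
with dbm.open(path, "r") as db:
    print(db[b"user:1"])  # b'Ada'
```

There is no way to "query" such a store; the application looks records up by key, which is exactly the trade-off the paragraph above describes.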

Firebird
Firebird is a relational database with many ANSI SQL-99 features that runs on Windows, Linux, and a variety of Unix platforms. Firebird provides excellent concurrency, high performance, and powerful language support for stored procedures and database triggers. Firebird has been used in production systems, under a variety of names, since 1981. Firebird is a commercially independent project of C and C++ programmers, technical advisors, and supporters developing and enhancing a multi-platform relational database management system based on the source code released by Borland Software Corp. on July 25th, 2000 under the InterBase Public License. Firebird is completely free of any registration, licensing, or deployment fees and may be deployed freely for use with any third-party software, whether commercial or not.

What is ODBC?
ODBC is an abbreviation for Open Database Connectivity, an interface for accessing databases via SQL queries. ODBC can be used as an access tool for various databases such as MS Access, dBase, DB2, Excel, and text files. Through the Call Level Interface (CLI) specifications of the SQL Access Group, ODBC allows a neutral way of accessing the data stored in personal computers and various databases. It was first created by Microsoft and Simba Technologies. ODBC 1.0 was initially released in September 1992 and eventually became part of the international SQL standard in 1995. Microsoft originally shipped ODBC as a set of DLLs and still ships the set with every copy of Microsoft Windows. Today, more and more operating systems, such as Unix and Apple's Mac OS, are adopting ODBC.

The strength of ODBC is that, by providing a universal data access interface, it frees independent software vendors from having to learn multiple application programming interfaces. Simply put, with ODBC, applications can simultaneously access, view, and modify data from numerous and quite diverse databases. This is because ODBC translates the SQL queries into a form readable by each of the various databases. However, ODBC also has its drawbacks. Managing a huge number of ODBC clients can mean an immense number of drivers and DLLs being run, which can lead to system administration overhead. This issue led to further use of ODBC server technology (or "multi-tier ODBC drivers") to alleviate the load. Also, since drivers are key to ODBC, some have raised the issue of the stability of newer drivers, as many have been shown to have bugs.

How ODBC is Processed
To use ODBC, three components are needed: an ODBC client, an ODBC driver, and a DBMS server (e.g. Microsoft Access, SQL Server, Oracle, or FoxPro). First, the ODBC client issues a command to interact (requesting and/or sending data) with the DBMS server (the back end). The DBMS server will not yet understand the client's command, however, because the command has not been processed through the ODBC driver (the front end). The ODBC driver therefore translates the command into a form the DBMS server can process and sends it there. The DBMS server then responds to the ODBC driver, which translates the final output for the ODBC client.

What is JDBC?
JDBC (Java Database Connectivity) is an API (Application Programming Interface) for connecting to databases from the Java environment. JDBC is an alternative to ODBC. JDBC's Java interface is more comfortable for Java programmers than ODBC's C-language interface. JDBC is included with both J2SE and J2EE.
If no JDBC driver is available for your needs, a JDBC-ODBC bridge may be used to connect to an ODBC driver via the JDBC API. Java 2 includes a JDBC-ODBC bridge for Solaris and Microsoft Windows.

What is a DSN?
A DSN (Data Source Name) is an identifier which defines a data source for an ODBC driver. A DSN consists of information such as:
* Database name
* Directory
* Database driver
* User ID
* Password
Under Unix, DSN configuration is usually stored in /etc/odbc.ini. Under Microsoft Windows, DSN configuration is normally stored in the registry, although it may also be stored in configuration files with a .dsn extension.
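A DSN entry in /etc/odbc.ini might look like the following sketch; the data source name, driver path, server, and credentials are purely illustrative:

```ini
; Hypothetical /etc/odbc.ini entry -- all values here are made up for illustration
[SalesDB]
Description = Company sales database
Driver      = /usr/lib/libmyodbc.so
Server      = db.example.com
Database    = sales
UserID      = report_user
Password    = secret
```

An application can then connect simply by naming the DSN ("SalesDB"); the driver and connection details are resolved from this configuration.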

What are Java Databases?

HSQLDB
HSQLDB is the leading SQL relational database engine written in Java. HSQLDB has a JDBC driver and supports a rich subset of ANSI-92 SQL (BNF tree format) plus SQL:1999 and SQL:2003 enhancements. HSQLDB offers a small (less than 100 KB in one version), fast database engine with both in-memory and disk-based tables. Embedded and server modes are available. Additionally, it includes tools such as a minimal web server, in-memory query and management tools (which can be run as applets), and a number of demonstration examples. The product is currently used as a database and persistence engine in many open source software projects and even in commercial projects and products. In its current version it is extremely stable and reliable. HSQLDB is best known for its small size, its ability to execute completely in memory, and its speed. This feature-packed software is completely free under licenses based on the standard BSD license: free of cost and onerous restrictions, and fully compatible with all major open source licenses. Java source code and extensive documentation are always included.

Berkeley DB Java Edition
Berkeley DB JE is a high-performance, transactional storage engine written entirely in Java. Like the highly successful Berkeley DB product, Berkeley DB JE executes in the address space of the application, without the overhead of client/server communication. It stores data in the application's native format, so no runtime data translation is required. Berkeley DB JE supports full ACID transactions and recovery. It provides an easy-to-use interface, allowing programmers to store and retrieve information quickly, simply, and reliably. Berkeley DB JE was designed from the ground up in Java and takes full advantage of the Java environment. The Berkeley DB JE API provides a Java Collections-style interface, as well as a programmatic interface similar to the Berkeley DB API.
The architecture of Berkeley DB JE supports high performance and concurrency for both read-intensive and write-intensive workloads. Berkeley DB JE is different from all other Java databases available today: it is not a relational engine built in Java, but a Berkeley DB-style embedded store, with an interface designed for programmers, not DBAs. Its architecture employs a log-based, no-overwrite storage system, enabling high concurrency and speed while providing ACID transactions and record-level locking. Berkeley DB JE efficiently caches the most commonly used data in memory without exceeding application-specified limits; in this way it works with an application to use available JVM resources while providing access to very large data sets. The Berkeley DB JE architecture provides an underlying storage layer for any Java application requiring high performance, transactional integrity, and recoverability.

IBM Cloudscape
IBM Cloudscape is a pure-Java, open-source-based relational database management system that can be embedded in Java programs and used for online transaction processing (OLTP). A platform-independent, small-footprint (2 MB) database, Cloudscape integrates tightly with any Java-based solution.

What is a Data Warehouse?

A data warehouse is a place where data is stored for archival, analysis, and security purposes. Usually a data warehouse is either a single computer or many computers (servers) tied together to create one giant computer system. The data can consist of raw or formatted data and can cover various topics, including an organization's sales, salaries, operational data, summaries and reports, copies of data, human resource data, inventory data, and external data used for simulations and analysis. Besides being a storehouse for large amounts of data, a data warehouse must have systems in place that make it easy to access the data and use it in day-to-day operations. A data warehouse is sometimes said to be a major part of a decision support system: a way for an organization to use data to come up with facts, trends, or relationships that can help it make effective decisions or create effective strategies to accomplish its goals. There are many different models of data warehouses, including Online Transaction Processing (OLTP), a model built for speed and ease of use. Another type is Online Analytical Processing (OLAP), which is more difficult to use and adds an extra step of analysis within the data; it usually requires more steps, which slows the process, and much more data in order to analyze certain queries. In addition to these models, one of the more common data warehouse models is a warehouse that is subject oriented, time variant, nonvolatile, and integrated. Subject oriented means that data is linked together and organized by relationships. Time variant means that any change to data in the warehouse can be tracked; usually all changes are stamped with a date and time and with a before and after value, so that changes can be shown over a period of time. Nonvolatile means that data is never deleted or erased.
This is a great way to protect your most crucial data: because the data is retained, you can continue to use it in later analysis. Finally, the data is integrated, which means that a data warehouse uses data from across the organization instead of from just one department. Besides the term data warehouse, a term that is frequently used is data mart. Data marts are smaller, less integrated data housings; they might hold just a database of human resources records, or sales data for just one division. With improvements in technology, as well as innovations in data warehousing techniques, data warehouses have evolved from offline operational databases to online integrated data warehouses. Offline operational data warehouses are warehouses where data is usually copied from real-time data networks into an offline system where it can be used; this is usually the simplest and least technical type of data warehouse. Offline data warehouses are updated frequently (daily, weekly, or monthly), and that data is then stored in an integrated structure where others can access it and perform reporting. Real-time data warehouses are updated moment to moment with the influx of new data; for instance, a real-time data warehouse might incorporate data from a point-of-sale system and be updated with each sale that is made.
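The time-variant, nonvolatile properties described earlier can be sketched in a few lines: every change is kept as a time-stamped record with its before and after values, and nothing is ever overwritten. The field name and values are purely illustrative:

```python
from datetime import datetime, timezone

# Change history for a warehouse: append-only, never deleted or overwritten.
history = []

def record_change(field, before, after):
    """Stamp each change with its before/after values and a UTC timestamp."""
    history.append({
        "field": field,
        "before": before,
        "after": after,
        "stamp": datetime.now(timezone.utc).isoformat(),
    })

record_change("salary", 50000, 52000)
record_change("salary", 52000, 55000)

# The full sequence of changes remains available for later analysis.
print([(h["before"], h["after"]) for h in history])  # [(50000, 52000), (52000, 55000)]
```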

Integrated data warehouses are data warehouses that other operational systems can access. Some integrated data warehouses are used by other data warehouses, allowing them to process reports as well as look up current data. So why should you or your organization use a data warehouse? Here are some of the pros and cons of this type of structure for data. The number one reason to implement a data warehouse is so that employees or end users can access it and use the data for reports, analysis, and decision making. Using the data in a warehouse can help you locate trends, focus on relationships, and understand more about the environment your business operates in. Data warehouses also increase the consistency of the data and allow it to be checked over and over to determine how relevant it is. Because most data warehouses are integrated, you can pull data from many different areas of your business, for instance human resources, finance, IT, and accounting. While there are plenty of reasons to have a data warehouse, there are a few negatives, including the fact that it is time-consuming to create and to keep operating. You might also have a problem with current systems being incompatible with your data. It is also important to consider future equipment and software upgrades, which may also need to be compatible with your data. Finally, security might be a huge concern, especially if your data is accessible over an open network such as the internet: you do not want your data to be viewed by a competitor or, worse, hacked and destroyed.

What is Data Mining?
Data mining is usually defined as searching, analyzing, and sifting through large amounts of data to find relationships, patterns, or significant statistical correlations.
With the advent of computers, large databases and the internet, it is easier than ever to collect millions, billions and even trillions of pieces of data that can then be systematically analyzed to look for relationships and to seek solutions to difficult problems. Besides governmental uses, many marketers use data mining to find strong consumer patterns and relationships. Large organizations and educational institutions also data mine to find significant correlations that can enhance our society.

While data mining is amoral in that it only looks for strong statistical correlations or relationships, it can be used for either good or not so good purposes. For instance, many government organizations depend on data mining to help them create solutions for societal problems. Marketers use data mining to pinpoint and focus their attention on certain segments of the market to sell to, and in some cases black hat hackers can use data mining to steal from and scam thousands of people.

How does data mining work? The quick answer is that large amounts of data are collected. Most entities that perform data mining are large corporations and government agencies; they have been collecting data for decades and have plenty to sift through. If you are a fairly new business or individual, you can purchase certain types of data to mine for your own purposes. Data can also be stolen from large repositories by hackers, either by breaking into a large database or simply by stealing laptops that are ill protected.

If you are interested in a small case study of how mined data is collected, used and profited from, look at your local supermarket. Your supermarket is usually an extremely lean and organized entity that relies on data mining to make sure that it is profitable. Usually your supermarket employs a POS (Point of Sale) system that collects data on each item purchased: the item's brand name, category, size, the time and date of the purchase and the price the item was purchased at. In addition, the supermarket usually has a customer rewards program, which is also input into the POS system. This information can directly link the products purchased with an individual. All this data, for every purchase made over years and years, is stored by the supermarket in a database.

Now that you have a database with millions upon millions of fields and records, what are you going to do with it? You data mine it. Knowledge is power, and with so much data you can uncover trends, statistical correlations, relationships and patterns that can help your business become more efficient, effective and streamlined. The supermarket can now figure out which brands sell the most; what time of the day, week, month or year is the busiest; and what products consumers buy together with certain items. For instance, if a person buys white bread, what other items would they be inclined to buy? Typically we find it is peanut butter and jelly. There is a great deal of useful information a supermarket can extract just by mining the data it has already collected.

What is Data Management?
Data management is a broad field of study, but essentially it is the process of managing data as a resource that is valuable to an organization or business. One of the largest organizations dealing with data management, DAMA (the Data Management Association), states that data management is the process of developing data architectures, practices and procedures dealing with data and then executing these aspects on a regular basis.
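The supermarket scenario described earlier can be sketched as a simple co-occurrence count over purchase records. The transactions below are invented for illustration, and the counting approach is only a minimal stand-in for real market-basket analysis:

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records; item names are invented for illustration.
transactions = [
    {"white bread", "peanut butter", "jelly"},
    {"white bread", "peanut butter", "milk"},
    {"milk", "eggs"},
    {"white bread", "jelly"},
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Items most often bought together with white bread.
with_bread = {pair: n for pair, n in pair_counts.items() if "white bread" in pair}
for pair, n in sorted(with_bread.items(), key=lambda kv: -kv[1]):
    print(pair, n)
```

On real POS data the same idea scales up: the pairs with the highest counts are the relationships (white bread with peanut butter and jelly) that the store can act on.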
There are many topics within data management; some of the more popular include data modeling, data warehousing, data movement, database administration and data mining.

Data Modeling
Data modeling is first creating a structure for the data that you collect and use, and then organizing this data in a way that is easily accessible and efficient to store and to pull from for reports and analysis. To create a structure for data, it must be named appropriately, show its relationships with other data and fit appropriately in a class. For instance, if you have a database of media, you might have a hierarchical structure of objects that includes photos, videos and audio files. Within each category, you can classify objects accordingly.

Data Warehousing
Data warehousing is storing data effectively so that it can be accessed and used efficiently. Different organizations collect different types of data, but many organizations use their data the same way: to create reports and analyze their data in order to make quality business decisions. A data warehouse is usually an organization-wide repository of data; however, for very large corporations it can encompass just one office or one department.

Data Movement
Data movement is the ability to move data from one place to another. For instance, data needs to be moved from where it is collected to a database and then to an end user, and this process takes quite a bit of logistical insight. Not only do all the hardware, applications and collected data need to be compatible with one another, they must also be able to be classified, stored and accessed with ease within an organization. Moving data can be very expensive and can require lots of resources to make sure that data is moved efficiently, that data is secure in transit and that once it reaches the end user it can be used effectively, whether printed out as a report, saved on a computer or sent as an email attachment.

Database Administration
Database administration is extremely important in managing data. Every organization or enterprise needs database administrators who are responsible for the database environment. Database administrators are usually given authority over tasks that include recoverability, integrity, security, availability, performance, and development and testing support.

Recoverability is usually defined as backing data up and then testing the backups to make sure that they are valid. The task of integrity means ensuring that data pulled for certain records or files is in fact valid and has high data integrity. Data integrity is extremely important, especially when creating reports or when data is used for analysis; if your data is invalid, your results will be worthless. Database security is an essential task for database administrators: they are usually in charge of granting clearance and access to certain databases or trees in an organization. Another important task is availability, which means making sure a database is up and running; more uptime usually means a higher level of productivity. Performance is related to availability: it means getting as much out of the hardware, applications and data as possible, usually within the limits of an organization's budget, physical equipment and resources. Finally, a database administrator is usually involved in database development and testing support.
Database administrators are always trying to push the envelope, trying to get more use out of the data and to add better performing, more powerful applications, hardware and resources to the database structure. A database that is administered correctly is not only the sign of a competent database administrator; it also means that all end users have a huge resource in the data that is available. This makes it easy to create reports, conduct analysis and make high quality decisions based on data that is collected and used within the organization.

Data Mining
Another important topic in data management is data mining, a process in which large amounts of data are sifted through to reveal trends, relationships and patterns. Data mining is a crucial component of data management because it exposes interesting information about the data being collected. It is important to note that data is primarily collected so it can be used to find these patterns, relationships and trends that can help a business grow or create profit.

While there are many topics within data management, they all work together, from the beginning of the process, where data is collected, to the end, where it is sifted through, analyzed and formatted so that specialists can make quality decisions based upon it.

What is a Multimedia Database?
A multimedia database is a database that hosts one or more primary media file types such as .txt (documents), .jpg (images), .swf (videos), .mp3 (audio), etc. These media loosely fall into three main categories:

* Static media (time-independent, i.e. images and handwriting)
* Dynamic media (time-dependent, i.e. video and sound bytes)
* Dimensional media (i.e. 3D games or computer-aided drafting programs - CAD)

All primary media files are stored as binary strings of zeros and ones, and are encoded according to file type. The term "data" is typically referenced from the computer's point of view, whereas the term "multimedia" is referenced from the user's point of view.

Types of Multimedia Databases
There are numerous types of multimedia databases, including:

* The Authentication Multimedia Database (also known as a Verification Multimedia Database, i.e. retina scanning), which performs a 1:1 data comparison
* The Identification Multimedia Database, which performs a one-to-many data comparison (i.e. passwords and personal identification numbers)
* A newly emerging type, the Biometrics Multimedia Database, which specializes in automatic human verification based on algorithms over a person's behavioral or physiological profile. This removes the need for the person being scanned to remember a PIN or password, an advantage over traditional multimedia database methods that require the input of personal identification numbers and passwords. Fingerprint identification technology is also based on this type of multimedia database.

Difficulties Involved with Multimedia Databases
The difficulties in making these different types of multimedia databases readily accessible to humans include:

* The tremendous amount of bandwidth they consume;
* Creating globally accepted data-handling platforms, such as Joomla, and the special considerations that these new multimedia database structures require;
* Creating a globally accepted operating system, including the applicable storage and resource management programs needed to accommodate the vast global hunger for multimedia information;
* Accommodating the various human interfaces needed to handle 3D interactive objects in a logically perceived manner (i.e. SecondLife.com);
* Accommodating the vast resources required to utilize artificial intelligence to its fullest potential, including computer sight and sound analysis methods;
* The fact that historic relational databases (i.e. the Binary Large Objects - BLOBs - developed for SQL databases to store multimedia data) do not conveniently support content-based searches for multimedia content. A relational database cannot recognize the internal structure of a Binary Large Object, so internal multimedia data components cannot be retrieved. Basically, a relational database is an "everything or nothing" structure, with files stored and retrieved as a whole, which makes it completely inefficient for making multimedia data easily accessible to humans.
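This "everything or nothing" behavior can be illustrated with a small sketch using SQLite; the table schema, file name and byte string are hypothetical stand-ins for real multimedia content:

```python
import sqlite3

# In-memory database with a single BLOB column; the schema is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE media (name TEXT PRIMARY KEY, content BLOB)")

# The whole file is stored as one opaque byte string...
video_bytes = b"\x00\x01\x02\x03"  # stand-in for real multimedia content
conn.execute("INSERT INTO media VALUES (?, ?)", ("clip.mp4", video_bytes))

# ...and can only be retrieved as a whole. SQL can filter on the name
# column, but it cannot query *inside* the BLOB for visual content.
row = conn.execute(
    "SELECT content FROM media WHERE name = ?", ("clip.mp4",)
).fetchone()
assert row[0] == video_bytes  # the file comes back whole, or not at all
```

The database happily stores and returns the bytes, but any content-based question ("which clips contain a dog?") falls outside what the relational layer can answer.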

To effectively accommodate multimedia data, a database management system such as an Object-Oriented Database (OODB) or an Object-Relational Database Management System (ORDBMS) is required. Examples of object-relational database management systems include Odaptor (HP), UniSQL, ODBII and Illustra.

The flip side of the coin is that, unlike non-multimedia data stored in relational databases, multimedia data cannot be easily indexed, retrieved or classified except by way of social bookmarking, ranking and rating by actual humans. This is made possible by metadata retrieval methods, commonly referred to as tags and tagging. This is why you can search for "dogs", as an example, and a picture comes up based on your text search term; this is referred to as schematic mode. Searching with a picture of a dog to locate other dog pictures, by contrast, is referred to as paradigmatic mode. However, metadata retrieval, search and identification methods severely lack the ability to properly define uniform space and texture descriptions, such as the spatial relationships between 3D objects. The Content-Based Retrieval (CBR) multimedia database search method, however, is specifically based on these types of searches: if you were to search with an image or sub-image, you would then be shown other images or sub-images that relate in some way to your search, by way of color ratio or pattern, etc.
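Schematic-mode (tag-based) retrieval can be sketched in a few lines; the file names and tags below are a hypothetical hand-built index, standing in for the human bookmarking and tagging described above:

```python
# Hypothetical tag index: media file names mapped to human-assigned tags.
tags = {
    "rex.jpg":   {"dog", "outdoors"},
    "beach.jpg": {"sea", "outdoors"},
    "puppy.png": {"dog", "indoors"},
}

def search(term):
    """Schematic-mode search: match a text term against tags, not pixels."""
    return sorted(name for name, t in tags.items() if term in t)

print(search("dog"))  # -> ['puppy.png', 'rex.jpg']
```

Note that the search never inspects the images themselves, only the human-supplied metadata; paradigmatic (content-based) retrieval would instead compare pixel-level features such as color ratios or patterns.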
