Introduction
Distributed Database
A distributed database is a database in which storage devices are not all attached to a common CPU. It may be stored in multiple computers located in the same physical location, or may be dispersed over a network of interconnected systems.
Distributed Database Management System:
A distributed database management system (DDBMS) manages the database as if it were all stored on the same computer. The DDBMS synchronizes all the data periodically and, in cases where multiple users must access the same data, ensures that updates and deletes performed on the data at one location are automatically reflected in the data stored elsewhere.
With proper implementation, the users and administrators of a distributed system should be able to interact with it as if it were centralized. This transparency provides the desired functionality without special programming requirements, and allows any number of local and remote tables to be accessed at a given time across the network.
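The propagation of updates and deletes described above can be illustrated with a minimal sketch. The names `Site`, `ReplicaSync` and `synchronize` are hypothetical and do not come from any particular DDBMS. The sketch assumes every site holds a full copy of the data and ignores conflicting changes made to the same key at different sites, which a real system would have to resolve.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One node holding a full copy of the data (hypothetical model)."""
    name: str
    rows: dict = field(default_factory=dict)


class ReplicaSync:
    """Propagates updates and deletes made at one site to all other sites."""

    def __init__(self, sites):
        self.sites = {site.name: site for site in sites}
        self.pending = []  # changes applied locally but not yet propagated

    def update(self, origin, key, value):
        # Apply the change at the originating site and remember it.
        self.sites[origin].rows[key] = value
        self.pending.append((origin, "update", key, value))

    def delete(self, origin, key):
        self.sites[origin].rows.pop(key, None)
        self.pending.append((origin, "delete", key, None))

    def synchronize(self):
        """Replay pending changes, in order, at every site except the origin."""
        for origin, op, key, value in self.pending:
            for name, site in self.sites.items():
                if name == origin:
                    continue
                if op == "update":
                    site.rows[key] = value
                else:
                    site.rows.pop(key, None)
        self.pending.clear()
```

Calling `synchronize()` on a timer would correspond to periodic synchronization, while calling it after every change would approximate immediate propagation.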
The different types of transparency sought after in a DDBMS are data distribution transparency, heterogeneity transparency, transaction transparency, and performance transparency.
Data distribution transparency requires that the user of the database should not have to know how the data is fragmented (fragmentation transparency), know where the data they access is actually located (location transparency), or be aware of whether multiple copies of the data exist (replication transparency).
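As a rough illustration of data distribution transparency, the sketch below hides fragmentation, location and replication behind a single `get` call. The class `DistributedTable`, the catalog layout and the site names are all assumptions made for this example. A real DDBMS keeps this information in a distributed data catalog and routes queries itself.

```python
class DistributedTable:
    """Hides fragmentation, location and replication behind one lookup call."""

    def __init__(self, fragment_of, catalog, sites):
        self.fragment_of = fragment_of  # function: key -> fragment name
        self.catalog = catalog          # fragment name -> site names holding a copy
        self.sites = sites              # site name -> {fragment name -> {key -> row}}

    def get(self, key):
        # Fragmentation transparency: the caller never names a fragment.
        fragment = self.fragment_of(key)
        # Location and replication transparency: any site with a copy will do.
        for site_name in self.catalog[fragment]:
            rows = self.sites[site_name].get(fragment, {})
            if key in rows:
                return rows[key]
        raise KeyError(key)


# Hypothetical layout: customers split into two horizontal fragments by id.
sites = {
    "london": {"low": {1: "Ada"}},
    "tokyo": {"low": {1: "Ada"}, "high": {1500: "Grace"}},
}
catalog = {"low": ["london", "tokyo"], "high": ["tokyo"]}
table = DistributedTable(lambda key: "low" if key < 1000 else "high", catalog, sites)

# The caller asks for a row by key only, without naming a fragment or a site.
customer = table.get(1500)
```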
Heterogeneity transparency requires that the user should not be aware of the fact that they are using a different DBMS if they access data from a remote site. The user should be able to use the same language that they would normally use at their regular access point and the DDBMS should handle query language translation if needed.
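Query language translation can be sketched with a small example. Row limiting is written differently in several SQL dialects, so a translation layer might render the same request in different ways depending on the remote DBMS. The function `render_select` and the dialect labels are hypothetical, and the sketch covers only this one construct.

```python
def render_select(table, columns, limit, dialect):
    """Render 'first N rows of a table' in the syntax of the target DBMS.

    Illustration only: identifiers are interpolated without escaping,
    which is not safe for untrusted input.
    """
    cols = ", ".join(columns)
    if dialect in ("postgresql", "mysql"):
        return f"SELECT {cols} FROM {table} LIMIT {limit}"
    if dialect == "sqlserver":
        return f"SELECT TOP {limit} {cols} FROM {table}"
    if dialect == "oracle":
        # Row-limiting clause available in Oracle 12c and later.
        return f"SELECT {cols} FROM {table} FETCH FIRST {limit} ROWS ONLY"
    raise ValueError(f"unsupported dialect: {dialect}")
```

The user issues one logical request, and the layer chooses the dialect according to the site that holds the data.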
Transaction transparency requires that the DDBMS guarantee that concurrent transactions do not interfere with each other (concurrency transparency) and that it handle database recovery after a failure (recovery transparency).
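The two guarantees can be illustrated on a single node with a deliberately simple sketch. It uses one global lock to keep concurrent transactions from interfering and a snapshot to restore the previous state when a transaction fails. Both are assumptions chosen for brevity, and the `Store` class is hypothetical. A real DDBMS would use finer-grained concurrency control and coordinate commits across sites.

```python
import threading


class Store:
    """Single-node store with isolated, all-or-nothing transactions."""

    def __init__(self):
        self.data = {}
        self._lock = threading.Lock()

    def transaction(self, work):
        """Run work(data) in isolation; restore the old state if it raises."""
        with self._lock:  # only one transaction runs at a time
            snapshot = dict(self.data)  # shallow copy kept for recovery
            try:
                work(self.data)
            except Exception:
                # Recovery: undo every change made by the failed transaction.
                self.data.clear()
                self.data.update(snapshot)
                raise
```

Because each transaction holds the lock for its whole duration, a read-modify-write sequence inside `work` cannot be interleaved with another transaction.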
Performance transparency mandates that the DDBMS should have a comparable level of performance to a centralized DBMS. Query optimizers can be used to speed up response time.
4 Types of DDB Design
DDBMS Advantages and Disadvantages
Distributed database management systems deliver several advantages over traditional systems. That being said, they are subject to some problems.
Advantages of DDBMSs
- Reflects organizational structure
- Improved shareability
- Improved availability
- Improved reliability
- Improved performance
- Data are located near the site of greatest demand and are dispersed to match business requirements.
- Faster data access because users only work with a locally stored subset of the data.
- Faster data processing because the data is processed at several different sites.
- Growth facilitation: New sites can be added without compromising the operations of other sites.
- Improved communications because local sites are smaller and closer to customers.
- Reduced operating costs: It is more cost-effective to add workstations to a network rather than update a mainframe system.
- User-friendly interface equipped with an easy-to-use GUI.
- Fewer instances of single-point failure because data and workload are distributed among other workstations.
- Processor independence: The end user is able to access any available copy of data.
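The availability and processor independence points above can be sketched as a read that falls back to another copy when a site is unreachable. The names `SiteDown` and `read_any` are hypothetical, and each replica is modelled as a plain function that either returns a value or raises `SiteDown`.

```python
class SiteDown(Exception):
    """Raised by a replica that cannot be reached (hypothetical)."""


def read_any(replicas, key):
    """Return the value from the first replica that answers."""
    last_error = None
    for read in replicas:
        try:
            return read(key)
        except SiteDown as exc:
            last_error = exc  # remember the failure and try the next copy
    raise SiteDown("no replica available") from last_error
```

The read fails only when every copy is unavailable, which is the sense in which replication reduces single points of failure.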
Disadvantages of DDBMSs
- Increased cost
- Integrity control more difficult
- Lack of standards
- Database design more complex
- Complexity of management and control. Applications must recognize data location and they must be able to stitch together data from various sites.
- Technologically difficult: Data integrity, transaction management, concurrency control, security, backup, recovery, query optimization, and access path selection are all issues that must be addressed and resolved.
- Security lapses are more likely when data are stored in multiple locations.
- Lack of standards: The absence of standard communication protocols can make the processing and distribution of data difficult.
- Increased storage and infrastructure requirements because multiple copies of data must be kept at separate locations, which requires more disk space.
- Increased training costs due to the greater complexity of the system.
- Requires duplicate infrastructure (personnel, software and licensing, physical location/environment), which can sometimes offset any operational savings.