Practical Database Design Methodologyand Use of UML Diagrams406.426 Design & Analysis of Database SystemsJonghun [email protected] of Industrial EngineeringSeoul National University

chapter outline information system life cycle phases of database design UML diagrams Rational Rose other tools design tools2

IT as a key to successful business data is regarded as a corporate resource, and its management andcontrol is considered central to the effective working of anorganization more functions in organizations are computerized, increasing theneed to keep large volumes of data available in an up-to-theminute current state as the complexity of the data and applications grows, complexrelationships among the data need to be modeled and maintained there’s a tendency toward consolidation of information resources inmany organizations many organizations are reducing their personnel costs by letting theend-user perform business transactions3

characteristics of database systems data independence from changes in the underlying logicalorganization and in the physical access paths and storage structures external schemas that allow the same data to be used for multipleapplications integration of data across multiple applications into a single DB simplicity of developing new applications using high-levellanguages like SQL possibility of supporting casual access for browsing and querying bymanagers while supporting major production-level TP4

trends in DB systems personal DBs is gaining popularity Excel, MySQL, Access, check-out and check-in advent of distributed & client-server DBMSs for better local control and faster local processing emergence of Web-based applications using data dictionary systems (or information repositories) data about DB DB structure, constraints, applications, authorizations, performance-critical TP systems around-the-clock nonstop operation hundreds of transactions per min.5

information system life cycle feasibility analysis cost-benefit studies, setting up priorities, scopes, requirements collection and analysis interacting with potential users design design of DB system, design of application systems implementation validation and acceptance testing against performance criteria and behavior specifications deployment, operation and maintenance new requirements or applications crop up6

DB application system life cycle system definitionDB designDB implementationloading or data conversion time consuming application conversion time consuming testing and validation operation usually the old and the new systems are operated in parallel for sometime monitoring and maintenance7

database design problem definition: design the logical and physical structure of oneor more databases to accommodate the information needs of theusers in an organization for a defined set of applications goals satisfy the information content requirementsprovide a natural and easy-to-understand structuring of informationsupport processing requirements and any performance objectivestradeoff between “understandability” and “performance”8

phases of DB design and implementation9

phase 1: requirements collection and analysis major activities application areas and user groups are identified existing documentation concerning the applications is analyzed current operating environment and planned use of the information isstudied types of transactions and their frequencies, the flow of information,geographic characteristics, origin of transactions, destination of reports,input and output data for the transactions, written responses to sets of questions are sometimes collected from thepotential DB users requirements are subject to change! JAD (Joint Application Design) contextual design10

phase 1: requirements collection and analysis requirement specification techniques diagramming techniques OOA DFD formal specification methods e.g., Z hardly used upper CASE tools help check the consistency and completeness of specifications correcting a requirement error is much more expensive thancorrecting an error made during implementation11

phase 2: conceptual DB design involves two parallel activities: conceptual schema design andtransaction and application design conceptual schema design is DBMS-independent because complete understanding of the DB structure, semantics,interrelationships, and constraints can be best achieved independentlyof a specific DBMS choice of DBMS and later design decision may change high-level data model is more expressive and general than the datamodels of individual DBMS diagrammatic description of the conceptual schema can serve as anexcellent vehicle of communication among database users, designers,and analysts12

phase 2: conceptual DB design desired characteristics of a conceptual data model expressivenesssimplicity and understandabilityminimalitydiagrammatic representationformality the above characteristics usually result in conflicts output entity types, relationship types, attributes key attributes, cardinality and participation constraints on relationships,weak entity types, specialization/generalization hierarchies, 13

phase 2: conceptual DB design approaches to conceptual schema design centralized (or one-shot) schema design approach requirements of the different applications and user groups from Phase 1 aremerged into a single set of requirements before schema design begins single schema corresponding to the merged set of requirements is thendesigned view integration approach schema is designed for each user group or application based only on its ownrequirements during a subsequent view integration phase, the schemas are merged orintegrated into a global conceptual schema for the entire DB more popular14

phase 2: conceptual DB design strategies for schema design top-down strategy start with a schema containing high-level abstractions and then applysuccessive top-down refinements bottom-up strategy start with a schema containing basic abstractions and then combine or addto these abstractions inside-out strategy special case of a bottom-up strategy, where attention is focused on a centralset of concepts that are most evident modeling then spreads outward by considering new concepts in the vicinityof existing ones mixed strategy requirements are partitioned according to a top-down strategy, and part ofthe schema is designed for each partition according to a bottom-up strategy15

example of top-down refinement16

example of bottom-up refinement17

phase 2: conceptual DB design schema integration identifying correspondences and conflicts among the schemas naming conflicts: synonyms, homonymstype conflicts: e.g., entity vs. attributedomain conflictsconflicts among constraints modifying views to conform to one another merging of views involves a considerable amount of human intervention and negotiation toresolve conflicts restructuring to remove any redundancies and unnecessary complexity18

example of view modification (1)19

example of view modification (2)20

phase 2: conceptual DB design strategies for the view integration process binary ladder integration 2 schemas that are quite similar are integrated first N-ary integration all the views are integrated in one procedure binary balanced strategy pairs of schemas are integrated first, then the resulting schemas are pairedfor further integration mixed strategy schemas are partitioned into groups based on their similarity, and eachgroup is integrated separately21

different strategies for the view integration22

phase 2: conceptual DB design phase 2b: transaction design to design the functional characteristics of known DB transactions(applications) in a DBMS-independent way 80-20 rule: 80 % of the workload is represented by 20 % of the mostfrequently used transactions identifying the transaction’s I/O retrieval, update, and mixed transactions identifying the transaction’s functional behavior notation for specifying processes activities, events, operations, sequencing, synchronizations, still remains an active area of research23

phase 3: choice of a DBMS technical considerations type of DBMS, the storage structures and access paths, UI, APIs, thetypes of high-level languages, availability of development tools, abilityto interface with other DBMSs, architectural options related to CSoperation, DBMS portability nontechnical considerations financial status and the support organization of the vendor, availabilityof vendor services, organization-wide adoption of a certain philosophy,familiarity of personnel with the system economic considerations software acquisition cost, maintenance cost, hardware acquisition cost,DB creation and conversion cost, personnel cost, training cost, operatingcost24

phase 3: choice of a DBMS drivers for DBMS data complexity, data sharing among applications, dynamically evolvingor growing data, frequency of ad hoc requests for data, data volume andneed for control common built-in features of DBMSs text editors and browsersreport generators and listing utilitiescommunication softwaredata entry and display features such as forms, screens, and menus withautomatic editing features inquiry and access tools that can be used on WWW graphical DB design tools25

phase 4: data model mapping to create a conceptual schema and external schemas in the datamodel of the selected DBMS two stages system-independent mapping: e.g., EER - relational schemas tailoring the schemas to a specific DBMS result: DDL statements in the language of the chosen DBMS thatspecify the conceptual and external level schemas of the DB system many automated CASE design tools can generate DDL from aconceptual schema design26

phase 5: physical database design process of choosing specific storage structures and access pathsfor the DB files to achieve good performance for the various DBapplications usually include various types of indexing, clustering of relatedrecords on disk blocks, linking related records via pointers, andvarious types of hashing frequently used criteria response time elapsed time between submitting a DB transaction for execution andreceiving a response space utilization amount of storage space used by the DB files and their access pathstructures on disk transaction throughput average # of transactions processed per min cf. benchmark test27

phase 6: DB system implementation and tuning typically the responsibility of the DBA and is carried out inconjunction with the DB designers language statements in DDL including SDL of the selected DBMSare compiled and used to create the DB schemas and DB files DB can then be loaded (populated) with the data conversion routines may be needed DB transactions must be implemented by the applicationprogrammers, and then writing and testing program code withembedded DML commands28

UML as a design specification standard even though its concepts are based on object-oriented techniques, theresulting models of structure and behavior can be used to design bothrelational, object-oriented, and object-relational DBs UML defines 9 types of diagrams structural diagrams describe the structural or static relationships among components class diagram, object diagram, component diagram, and deploymentdiagram behavioral diagrams describe the behavioral or dynamic relationships among components use case diagram, sequence diagram, collaboration diagram, statechartdiagram, and activity diagram29

UML diagrams class diagrams capture the static structure of the system and act as foundation forother models show classes, interfaces, collaborations, dependencies, generalizations,association and other relationships object diagrams show a set of objects and their relationships correspond to instance diagrams component diagrams illustrate the organizations and dependencies among softwarecomponents consists of components, interfaces, and dependency relationships deployment diagrams represent the distribution of components across the hardware topology30

UML diagrams use case diagrams model the functional interactions between users and the system use case is a set of scenarios that have a common goal sequence diagrams describe the interactions between various objects over time give a dynamic view of the system by showing the flow of messages betweenobjects collaboration diagrams represent interactions between objects as a series of sequenced messages show objects as icons and number the messages statechart diagrams describe how an object’s state changes in response to external events show all the possible states an object can get into in its lifetime activity diagrams present a dynamic view of the system by modeling the flow of control fromactivity to activity can be considered as flowcharts with states31

use-case diagram notation32

example use case diagram33

sequence diagram notation34

example of a sequence diagram35

statechart diagram notation36

example of statechart diagram37

data modeling using Rational Rose reverse engineering create a conceptual data model based on the DB structure forward engineering generate the DDL in a specific DBMS from a data model conceptual design in UML notationsupported DBs: IBM DB2, Oracle, SQL server, Sybaseconverting logical data model to object model and vice versasynchronization between the conceptual design and the actual DBextensive domain supporteasy communication among design teams38

graphical data model in Rational Rose39

logical data model diagram in Rational Rose40

CASE tools provided facilities diagramming model mapping design normalization desired characteristics easy-to-use interfaceanalytical componentsheuristic componentstrade-off analysisdisplay of design resultsdesign verification41

excellent vehicle of communication among database users, designers, and analysts. 13 phase 2: conceptual DB design . class diagram, object diagram, component diagram, and deployment . and dependency relationships deployment diagrams represent the distribution of components across the hardware topology. 31 UML diagrams use case diagrams