The Object-Relational Impedance Mismatch

www.agiledata.org: The Object-Relational Impedance Mismatch

Bringing data professionals and application developers together.

by Scott W. Ambler, Copyright 2002-2004

This essay summarizes Chapter 7 of Agile Database Techniques. 

Object-oriented technology supports the building of applications out of objects that have both data and behavior. Relational technologies support the storage of data in tables and the manipulation of that data via data manipulation language (DML), internally within the database via stored procedures and externally via SQL calls. Some relational databases go further and now support objects internally as well, a trend that will only grow stronger over time. It is clear that object technologies and relational technologies are in common use in most organizations, that both are here to stay for quite a while, and that both are being used together to build complex software-based systems. It is also clear that the fit between the two technologies isn’t perfect, that there is an “impedance mismatch” between the two.

In the early 1990s the differences between the two approaches were labeled the “object-relational impedance mismatch”, or simply “impedance mismatch” for short, labels that are still in common use today. Much of the conversation about the impedance mismatch focuses on the technical differences between object and relational technologies, and rightfully so because there are significant differences. Unfortunately, less attention has been paid to the cultural differences between the object-oriented community and the data community. These differences are often revealed when object professionals and data professionals argue with each other regarding the approach that should be taken by a project team.

 

Table of Contents

  • The Role of the Agile DBA
  • The Technological Impedance Mismatch  
  • Deceptive Similarities, Subtle Differences
  • The Cultural Impedance Mismatch  
  • Strategies for Overcoming the Object-Relational Impedance Mismatch
  • References

 

1. The Role of the Agile DBA

On the technical side it is the job of an Agile DBA to work with application developers to make object and relational technologies work together.  On the cultural side Agile DBAs will often find themselves in the role of mediator, typically between agile software developers and traditional data professionals.  In short, Agile DBAs act as bridges between both the object and data worlds as well as between the agile and traditional worlds.

 

2. The Technological Impedance Mismatch

Why does a technological impedance mismatch exist?  The object-oriented paradigm is based on proven software engineering principles.   The relational paradigm, however, is based on proven mathematical principles.  Because the underlying paradigms are different the two technologies do not work together seamlessly. 

The impedance mismatch becomes apparent when you look at the preferred approach to access: With the object paradigm you traverse objects via their relationships whereas with the relational paradigm you join the data rows of tables.  This fundamental difference results in a non-ideal combination of object and relational technologies, although when have you ever used two different things together without a few hitches? 
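The difference in access style can be made concrete with a small sketch. It uses Python with the standard-library sqlite3 module, and it assumes a simplified one-to-many Customer/Address pair; the table, column, and function names are illustrative assumptions and are not taken from the figures in this essay. The first function answers a question by traversing references that the object already holds, while the second answers the same question by joining rows on matching key values.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Address:
    street: str
    city: str


@dataclass
class Customer:
    name: str
    # The relationship is a collection of direct object references.
    addresses: list = field(default_factory=list)


def cities_by_traversal(customer):
    """Object paradigm: follow the references the object already holds."""
    return sorted(address.city for address in customer.addresses)


def cities_by_join(conn, customer_id):
    """Relational paradigm: join rows of two tables on matching key values."""
    rows = conn.execute(
        """
        SELECT a.city
        FROM Customer AS c
        JOIN Address AS a ON a.customer_id = c.customer_id
        WHERE c.customer_id = ?
        ORDER BY a.city
        """,
        (customer_id,),
    ).fetchall()
    return [city for (city,) in rows]


def build_demo_database():
    """Create an in-memory database holding one customer with two addresses."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Customer (
            customer_id INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE Address (
            address_id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES Customer(customer_id),
            street TEXT NOT NULL,
            city TEXT NOT NULL
        );
        INSERT INTO Customer VALUES (1, 'Ada');
        INSERT INTO Address VALUES (10, 1, '1 Main St', 'Toronto');
        INSERT INTO Address VALUES (11, 1, '9 Lake Rd', 'Ottawa');
        """
    )
    return conn
```

Both functions return the same list of cities for the same customer. The traversal needs no keys at all, whereas the join cannot be expressed without them, which is one reason a mapping layer between the two has to carry extra information.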

To succeed using objects and relational databases together you need to understand both paradigms, and their differences, and then make intelligent tradeoffs based on that knowledge.  Relational Databases 101 overviews relational databases and Data Modeling 101 describes the basics of data modeling, providing you with sufficient background to understand the relational paradigm.  Similarly Object-Orientation 101 overviews object-orientation and the UML, explaining the basics of the object-oriented paradigm.  Until you understand both paradigms, and gain real-world experience working in both technologies, it will be very difficult to see past the deceptive similarities between the two.

 

3. Deceptive Similarities, Subtle Differences

Figure 1 depicts a physical data model (PDM) using the unofficial UML data modeling notation.  Figure 2 depicts a UML class diagram.  On the surface they look like very similar diagrams, and on the surface they in fact are.  It’s how you arrive at the two diagrams that can be very different.

 

Figure 1. A physical data model (UML notation).

 

Let’s consider the deceptive similarities between the two diagrams. Both diagrams depict structure: the PDM shows four database tables and the relationships between them, whereas the UML class diagram shows four classes and their corresponding relationships. Both diagrams depict data: the PDM shows the columns within the tables and the class model shows the attributes of the classes. Both diagrams also depict behavior: the Customer table of Figure 1 includes a delete trigger and the Customer class of Figure 2 includes two operations. The two diagrams also use similar notations, something that I did on purpose, although the UML data modeling notation is little different from other industry notations.

Figure 2. A UML class model.

 

Differences in your modeling approaches will result in subtle differences between your object schema and your data schema:

  • By considering both data and behavior in the class diagram the modeler created a different structure than in the data model that only considered data 

  • Data normalization in data modeling versus class normalization in class modeling

  • The application of data analysis patterns (Hay 1996) versus object-oriented analysis patterns (Fowler 1997; Ambler 1997) and design patterns (Gamma et al. 1995)

There are differences in the types of relationships that each model supports, with class diagrams being slightly more robust than physical data models for relational databases. This is because of the inherent nature of the technologies. For example, you see that there is a many-to-many relationship between Customer and Address in Figure 2, a relationship that was resolved in Figure 1 via the CustomerAddress associative table. Object technology supports this type of relationship directly but relational databases do not, which is why the associative table was introduced.
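A minimal sketch of how the same many-to-many relationship looks on each side, written in Python with the standard-library sqlite3 module. The CustomerAddress table name comes from the discussion above; the column names other than AddressID, the sample data, and the helper functions are assumptions made for illustration only.

```python
import sqlite3


class Customer:
    def __init__(self, name):
        self.name = name
        self.addresses = []  # direct references to Address objects


class Address:
    def __init__(self, street):
        self.street = street
        self.customers = []  # direct references back to Customer objects


def link(customer, address):
    """Objects: a many-to-many link is simply a pair of references."""
    customer.addresses.append(address)
    address.customers.append(customer)


SCHEMA = """
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Address (AddressID INTEGER PRIMARY KEY, Street TEXT NOT NULL);
-- The associative table: one row per (customer, address) pairing.
CREATE TABLE CustomerAddress (
    CustomerID INTEGER NOT NULL REFERENCES Customer(CustomerID),
    AddressID INTEGER NOT NULL REFERENCES Address(AddressID),
    PRIMARY KEY (CustomerID, AddressID)
);
"""


def build_demo_database():
    """Two customers share one address; one customer also has a second address."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO Customer VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
    conn.executemany(
        "INSERT INTO Address VALUES (?, ?)", [(10, "1 Main St"), (11, "9 Lake Rd")]
    )
    conn.executemany(
        "INSERT INTO CustomerAddress VALUES (?, ?)", [(1, 10), (2, 10), (1, 11)]
    )
    return conn


def customers_at(conn, address_id):
    """Tables: the relationship is recovered by joining through CustomerAddress."""
    rows = conn.execute(
        """
        SELECT c.Name
        FROM Customer AS c
        JOIN CustomerAddress AS ca ON ca.CustomerID = c.CustomerID
        WHERE ca.AddressID = ?
        ORDER BY c.Name
        """,
        (address_id,),
    ).fetchall()
    return [name for (name,) in rows]
```

On the object side the relationship lives in the two lists and has no representation of its own. On the relational side it exists only as rows in the third table, so any mapping has to translate between "a reference in a collection" and "a row holding two foreign keys".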

Figure 3 also reveals a schism within the object community. It is common practice to not show keys on class diagrams (Ambler 2003); for example, none are shown in Figure 2. However, the reality is that when you are using a relational database to store your objects, each object must maintain enough information to be able to successfully write itself, and the relationships it is involved with, back out to the database. This is something that I call “shadow information”, which you can see has been added in Figure 3 in the form of attributes with implementation visibility (no visibility symbol is shown). For example, the Address class now includes the attribute addressID, which corresponds to AddressID in the Address table (the attributes customers, state, and zipCode are required to maintain the relationships to the Customer, State, and ZipCode classes respectively).

 

Figure 3. A fully attributed UML class model.
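To illustrate why shadow information is needed, here is a minimal Python sketch using the standard-library sqlite3 module. The address_id attribute plays the role of addressID in Figure 3 (renamed to follow Python conventions); the Street and City columns, the save method, and the create_schema helper are assumptions made for the example and do not come from the figures.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Address:
    street: str
    city: str
    # Shadow information: not part of the business concept, but required so
    # the object can find its own row again. None means "not yet persisted".
    address_id: Optional[int] = None

    def save(self, conn):
        """Write this object to the Address table, inserting or updating."""
        if self.address_id is None:
            cursor = conn.execute(
                "INSERT INTO Address (Street, City) VALUES (?, ?)",
                (self.street, self.city),
            )
            # Remember the generated key so later saves update the same row.
            self.address_id = cursor.lastrowid
        else:
            conn.execute(
                "UPDATE Address SET Street = ?, City = ? WHERE AddressID = ?",
                (self.street, self.city, self.address_id),
            )


def create_schema(conn):
    """Create the hypothetical Address table used by this sketch."""
    conn.execute(
        "CREATE TABLE Address ("
        "AddressID INTEGER PRIMARY KEY, Street TEXT NOT NULL, City TEXT NOT NULL)"
    )
```

Without the address_id attribute, a second call to save would have no way to tell which row belongs to the object and could only insert a duplicate. That is the sense in which the key is invisible on the class diagram yet unavoidable in the implementation.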

 

The schism is that the object community has a tendency to underestimate the importance of object persistence.  Symptoms of this problem include:

  • The lack of an official data model in the UML (see The Unofficial UML Data Modeling Profile)

  • The practice of not modeling keys on class diagrams

  • The misguided belief that you can model the persistent aspects of your system by applying a few stereotypes to a UML class diagram

  • Many popular OOA&D books spend little or no time discussing object persistence issues

Yet in reality object developers discover that they need to spend significant portions of their time making their objects persistent, perhaps because they’ve run into performance problems after improper mappings or perhaps because they’ve discovered that they didn’t take legacy data constraints into account in their design. My experience is that persistence is a significant blind spot for many object developers, one that promotes the cultural impedance mismatch discussed below.

 

4. The Cultural Impedance Mismatch

The cultural impedance mismatch, something that I call the “object-data divide” (Ambler 2000a; Ambler 2000b), refers to the politics between the object community and the data community. Specifically, it refers to the difficulties that object-oriented and data-oriented developers experience when working together, and more generally to the dysfunctional politics between the two communities that occur within IT organizations and even within the IT industry itself. These are problems that the Agile Data (AD) method strives to overcome. Symptoms of the object-data divide include object developers who claim that relational technology either shouldn't or can't be used to store objects, and data professionals who claim that your object/component models must be driven by their data models. Like most prejudices, neither of these beliefs is even remotely based on fact: In Relational Databases 101 you saw that relational databases are used to store a wide range of data, including the data representing objects, and in Different Projects Require Different Strategies you learned that there are several ways to approach development in addition to a data-driven approach.

To understand why our industry suffers from the object-data divide you need to consider the history of object technology. Object technology was first introduced in the late 1960s and adopted by the business community in the late 1980s and early 1990s; even now many organizations are just starting to use it for mission-critical software. As with most other new technologies, there was spectacular hype surrounding objects at the start: Everything is an object. Object technology is a silver bullet that solves all of our problems. Objects are easier to understand and to work with. Object technology is the only thing that you'll ever need. In time reality prevailed and these claims were seen for what they were, wishful thinking at best. Unfortunately one bit of hype did serious damage: the idea that the pure approach supported by objectbases would quickly eclipse the "questionable" use of relational technologies. This mistaken belief, combined with the findings of several significant research studies that showed that object techniques and structured techniques don't mix well in practice, led many within the object community to proclaim that objects and relational databases shouldn't be used together.

At the same time the data community was coming into its own. Already important in the traditional mainframe world, data modelers found their role in the two-tier client/server world (the dominant technology at the time for new application development) to be equally critical. Development in both of these worlds worked similarly: the data professionals would develop the data schema and the application developers would write their program code. This worked because there wasn't a lot of conceptual overlap between the two tasks: data models showed the data entities and their relationships, whereas the application/process models showed how the application worked with the data. From the point of view of data professionals very little had changed in their world. Then object technology came along. Some data professionals, myself among them, quickly recognized that the object paradigm was a completely new way to develop software and joined the growing object crowd. Unfortunately many data professionals believed the object paradigm to be either another fad doomed to fail or merely another programming technology, and therefore remained content with what they perceived to be the status quo.

Unfortunately both communities got it wrong. To the dismay of object purists, objectbases never proved to be more than a niche technology, whereas relational databases have effectively become the de facto standard for storing data. Furthermore, the studies of the late 1980s and early 1990s actually showed that you can't use structured models for object implementation languages such as C++ or Smalltalk, or object models for structured implementation languages such as COBOL or BASIC. They didn't address the idea of melding object and structured modeling techniques in order to drive your efforts working with implementation technologies such as object programming languages and relational databases. In fact, practice has shown that it is reasonably straightforward to map objects to relational databases.

To the dismay of data professionals, object modeling techniques, particularly those of the Unified Modeling Language (UML), are significantly more robust than data modeling techniques and are arguably a superset of data modeling (Muller 1999). The object approach had superseded the data approach; in fact there was such a significant conceptual overlap that many data professionals, not having recognized the subtle differences, mistakenly believed that class diagrams were merely data models with operations added in. What they didn't recognize is that the complexity of modeling behavior requires more than just class diagrams, hence the wealth of models defined by the UML, and that their focus on data alone was too narrow for the needs of modern application development. Object techniques proved to work well in practice: not only is object technology not a fad, it has become the dominant development platform, and the status quo has changed to the point that most modern development methodologies devote no more than a few pages to data modeling (to their detriment).

The object-data divide produces dire consequences:

  1. IT project teams fail to produce software on time and on budget. 

  2. The technical impedance mismatch is exacerbated. 

  3. Data models often prove to be poor drivers for object models. 

  4. Increased staff turnover. 

 

5. Strategies for Overcoming the Object-Relational Impedance Mismatch

Object and relational technologies are real, you are very likely working with both, and they are here to stay. Unfortunately the two technologies differ, these differences being referred to as “the object-relational impedance mismatch”. In this essay you learned that there are two aspects to the impedance mismatch: technical and cultural.

The technical mismatch can be overcome by ensuring that project team members, including both application developers and Agile DBAs, understand the basics of both technologies.  Furthermore, you should actively try to reduce the coupling that your database schema is involved with by encapsulating access to your database(s) as best you can, by designing your database well, and by keeping the design clean through database refactoring. 
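One way to picture "encapsulating access to your database" is a small class that is the only code aware of the table layout. The Python sketch below uses the standard-library sqlite3 module; the CustomerRepository class, its methods, the greeting function, and the Customer table columns are all hypothetical names chosen for illustration, and this is only one of several possible encapsulation strategies.

```python
import sqlite3
from typing import Optional


class CustomerRepository:
    """The only place in the application that knows the Customer table layout."""

    def __init__(self, conn):
        self._conn = conn

    def add(self, name: str) -> int:
        """Insert a customer and return the generated key."""
        cursor = self._conn.execute(
            "INSERT INTO Customer (Name) VALUES (?)", (name,)
        )
        return cursor.lastrowid

    def find_name(self, customer_id: int) -> Optional[str]:
        """Return the customer's name, or None if no such row exists."""
        row = self._conn.execute(
            "SELECT Name FROM Customer WHERE CustomerID = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None


def greeting(repo: CustomerRepository, customer_id: int) -> str:
    """Business logic: depends on the repository, never on SQL or column names."""
    name = repo.find_name(customer_id)
    return f"Hello, {name}!" if name else "Hello, guest!"


def open_demo_database():
    """Create an in-memory database with the hypothetical Customer table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT NOT NULL)"
    )
    return conn
```

Because greeting never mentions a table or column, a database refactoring such as renaming the Name column would require changes only inside CustomerRepository. That is the reduction in coupling the paragraph above describes.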

Overcoming the cultural impedance mismatch is much more difficult. Everyone needs to recognize that the problem exists and needs to be overcome. Object and data professionals have different skills, different backgrounds, different philosophies, and different ways that they prefer to work. Instead of finding ways to work together that take advantage of these differences, many IT shops have chosen to erect communication and political barriers between the two groups of professionals. These barriers must be removed, something that the adoption of the Agile Data (AD) method can help with. An important first step is to recognize that different projects require different approaches, that one “process size” does not fit all, and to manage accordingly. It isn’t sufficient for the data group to be right, or for the application group to be right; they need to be right together. Stop playing political games and instead find ways to work together.

 

6. References and Suggested Online Readings

List of References

You might find the following essays of interest:

  • Mapping Objects to Relational Databases

This book describes the philosophies and skills required for developers and database administrators to work together effectively on project teams following evolutionary software processes such as Extreme Programming (XP), the Rational Unified Process (RUP), Feature Driven Development (FDD), Dynamic Systems Development Method (DSDM), or the Enterprise Unified Process (EUP). In March 2004 it won a Jolt Productivity award.

This book presents a full-lifecycle, agile model driven development (AMDD) approach to software development. It is one of the few books which covers both object-oriented and data-oriented development in a comprehensive and coherent manner. Techniques the book covers include Agile Modeling (AM), Full Lifecycle Object-Oriented Testing (FLOOT), over 30 modeling techniques, agile database techniques, refactoring, and test driven development (TDD). If you want to gain the skills required to build mission-critical applications in an agile manner, this is the book for you.

 


Page first posted: November 11 2002
Page last updated: April 1 2004
