Design pattern: subclasses

Top down design

As you are developing a class diagram, you might discover that one or more attributes of a class are characteristics of only some individuals of that class, but not of others. This probably indicates that you need to develop a subclass of the basic class type. Designing subclasses from the “top down” in this way is called specialization; a class that represents a subset of another class type can also be called a specialization of its parent class.

Example: we will model the graduate students at a university. Some are employed by the university as teaching associates (TAs); some are employed as research associates (RAs); some are not employed by the university at all. For the TAs, we need to know which course they are assigned to teach; for the RAs, we need to know the grant number of the research project to which they are assigned. A first listing of the student attributes might look like this:

Graduate student class diagram

• For any one student, at least one of the two unique attributes (the TA's course and the RA's grant number) will always be null; frequently both will be null. We can fix the problem by showing two specialized classes of students: TAs and RAs. The UML symbol for subclass association is an open arrowhead that points to the parent class.

Graduate student with subclasses

• Unique attributes are now contained in the subclass types. Attributes that are common to all students remain in the superclass (parent).

• The verbs to describe a subclass association are implied by the diagram. In this case, we would say that each grad student may be either a TA, an RA, or neither; each TA or RA is a grad student.
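The same structure can be sketched in code. The following is a minimal illustration, not part of the original design: the class names, the student_id and name attributes, and the sample values are assumptions made for the example; only the course and grant number come from the description above.

```python
from dataclasses import dataclass


@dataclass
class GradStudent:
    """Superclass: attributes common to every graduate student."""
    student_id: int  # hypothetical identifier
    name: str        # hypothetical common attribute


@dataclass
class TeachingAssociate(GradStudent):
    """Subclass: adds the attribute that only TAs have."""
    course: str


@dataclass
class ResearchAssociate(GradStudent):
    """Subclass: adds the attribute that only RAs have."""
    grant_number: str


# Made-up sample data: one student of each kind.
students = [
    GradStudent(1, "Ana"),  # neither a TA nor an RA
    TeachingAssociate(2, "Ben", "CS 101"),
    ResearchAssociate(3, "Chen", "G-1234"),
]

# Each TA or RA is a grad student; a grad student may be neither.
all_are_students = all(isinstance(s, GradStudent) for s in students)
roles = [type(s).__name__ for s in students]
```

No object carries an attribute that must stay empty: a plain GradStudent has neither a course nor a grant number. Note that single inheritance, as used here, cannot represent a student who is both a TA and an RA.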

Specialization constraints

Rather than the usual cardinality/multiplicity symbols, the subclass association line is labeled with specialization constraints. Constraints are described along two dimensions: incomplete versus complete, and disjoint versus overlapping.

• In an incomplete specialization, also called a partial specialization, only some individuals of the parent class are specialized (that is, have unique attributes). Other individuals of the parent class have only the common attributes.

• In a complete specialization, all individuals of the parent class have one or more unique attributes that are not common to the generalized (parent) class.

• In a disjoint specialization, also called an exclusive specialization, an individual of the parent class may be a member of only one specialized subclass.

• In an overlapping specialization, an individual of the parent class may be a member of more than one of the specialized subclasses.
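The two dimensions can be checked mechanically once we know which individuals belong to which subclass. The sketch below is only an illustration: the function name classify_specialization and the identifier sets are hypothetical, and it assumes that every subclass member is also an individual of the parent class.

```python
def classify_specialization(parent_ids, subclasses):
    """Describe a specialization along the two constraint dimensions.

    parent_ids: set of identifiers for all individuals of the parent class.
    subclasses: dict mapping a subclass name to the set of its member ids.
    """
    member_sets = list(subclasses.values())
    specialized = set().union(*member_sets)

    # Complete: every parent individual belongs to at least one subclass.
    completeness = "complete" if set(parent_ids) <= specialized else "incomplete"

    # Disjoint: no individual is counted in more than one subclass.
    total_memberships = sum(len(members) for members in member_sets)
    disjointness = "disjoint" if total_memberships == len(specialized) else "overlapping"

    return completeness, disjointness


# Made-up data: student 1 is neither a TA nor an RA.
grad = classify_specialization({1, 2, 3}, {"TA": {2}, "RA": {3}})

# Made-up data: individual 11 belongs to both subclasses.
mixed = classify_specialization({10, 11, 12}, {"A": {10, 11}, "B": {11, 12}})
```

With this sample data, grad is ("incomplete", "disjoint") and mixed is ("complete", "overlapping").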

Relation scheme diagram

We create a table for each of the subclasses, linked to the parent class with a pk-fk pair as always. Since each relationship is one-to-one, the fk alone is enough to form the pk of the subclass table. There is no way to enforce the specialization constraints in the table structure; this has to be done by the data entry system. Notice that there is no attribute in the parent table to tell us if a student is a TA, an RA, or neither. The union of two outer join queries will produce a table with all of the information that we need.

Graduate student relation scheme
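As a rough sketch of this relation scheme, the following uses Python's built-in sqlite3 module with an in-memory database. The table names, column names, and sample rows are assumptions made for illustration. For brevity it gathers the information with a single query containing two left outer joins, rather than the union of two outer join queries mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE grad_students (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
-- In each subclass table, the fk to the parent is also the pk (one-to-one).
CREATE TABLE teaching_associates (
    student_id INTEGER PRIMARY KEY REFERENCES grad_students(student_id),
    course     TEXT NOT NULL
);
CREATE TABLE research_associates (
    student_id   INTEGER PRIMARY KEY REFERENCES grad_students(student_id),
    grant_number TEXT NOT NULL
);
-- Made-up sample rows.
INSERT INTO grad_students VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Chen');
INSERT INTO teaching_associates VALUES (2, 'CS 101');
INSERT INTO research_associates VALUES (3, 'G-1234');
""")

# Outer joins keep every student, whether or not a subclass row exists.
rows = conn.execute("""
SELECT g.student_id, g.name, t.course, r.grant_number
FROM grad_students AS g
LEFT OUTER JOIN teaching_associates AS t ON t.student_id = g.student_id
LEFT OUTER JOIN research_associates AS r ON r.student_id = g.student_id
ORDER BY g.student_id
""").fetchall()
```

A student who is neither a TA nor an RA comes back with NULL (None in Python) in both the course and grant_number columns, so the role can be read from the result even though the parent table stores no role attribute.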

Bottom up design

Sometimes, instead of finding unique attributes in a single class type, you might find two or more classes that have many of the same attributes. This probably indicates that you need to develop a superclass of the classes with common attributes. Designing a superclass from the “bottom up” in this way is called generalization; a class or entity that represents a superset of other class types can also be called a generalization of its child types. Note: if you have two or more class types with exactly the same set of attributes, you probably have only one class type instead of many!

Example (thanks to Martin Malolepszy): A student of mine had a summer job with a brush-clearing service. This is a fairly specialized business but an essential one in southern California, where dried plant growth (brush) can present a severe fire hazard if it is not cleared from around houses and other structures. In addition to his exhausting physical work, Martin built a small database to help the owner manage this business.

• One important class type was the lot (or property) to be cleared. Some lots were in the city, with a standard street-and-number address. Other lots were not on a city street, but were described by the county surveyor's section and tract number. It seemed as if there were two class types:

City and county lots class diagram

• Actually, a few of the lots were identified by both address schemes. A closer look at the city and county lot classes also shows two common descriptive attributes (the owner and the lot size). The common attributes should go in a generalization or superclass that is simply called a “lot.” The relation scheme is identical in structure to the previous example.

City and county lots with superclass

City and county lots relation scheme
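The generalization step itself amounts to finding the attributes that the candidate classes share. A minimal sketch, assuming each class is described simply by a set of attribute names (the exact spellings below are illustrative, based on the description of the lots above):

```python
# Attribute names for the two class types found first (spellings assumed).
city_lot = {"street_address", "owner", "lot_size"}
county_lot = {"section", "tract", "owner", "lot_size"}

# Attributes shared by both classes move up into the superclass ("lot").
common = city_lot & county_lot

# What remains stays in each subclass.
city_only = city_lot - common
county_only = county_lot - common
```

Here common holds the owner and the lot size, while each subclass keeps only its own addressing scheme. If the leftover sets were both empty, that would be the case noted earlier: one class type rather than two.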