Design pattern: recursive associations

A recursive association connects a single class type (serving in one role) to itself (serving in another role).

Example: In most companies, each employee (except the CEO) is supervised by one manager. Of course, not all employees are managers. This example is used in almost every database textbook, since the association between employees and managers is relatively easy to understand. Perhaps the best way to visualize it is to start with two class types:

Incorrect model

Recursive association (wrong)

• The problem with this model is that each manager is also an employee. Building the second (manager) table not only duplicates information from the employee table, it virtually guarantees that there will be mistakes and conflicts in the data. We can fix the problem by eliminating the redundant class and re-drawing the association line.

Correct model

Recursive association (corrected)

• Normally, we wouldn’t show an fk in the class diagram; however, including the manager as an attribute of the employee here (in addition to the association line) can help in understanding the model. In the relation scheme, we can explicitly show the connection between the surrogate pk (employeeID) and the managerID (which is an fk, even though it is in the same scheme).

Recursive relation scheme

In some project-oriented companies, an employee might work for more than one manager at a time. We also might want to keep a history of the employees’ supervision assignments over time. We can model either case by revising the class diagram to a many-to-many pattern:

Recursive association (many-to-many)

• The relation scheme for this model looks exactly like other many-to-many applications, with the exception that both foreign keys come from the same pk table.

Recursive relation scheme (many-to-many)

Retrieving data

To produce a list of employees and their managers, we have to join the employees table to itself, using two different aliases for the table name. An outer join is needed to include any employee who is not managed by anyone.

        SELECT E.lastName AS "Employee", M.lastName AS "Manager"
        FROM Employees E LEFT OUTER JOIN Employees M 
          ON E.managerID = M.employeeID
        ORDER BY E.lastName

In effect, the SQL statement works as if there were two copies of the employees table, as in the first (incorrect) UML diagram. You can visualize rows being joined this way:

Recursive table rows

The many-to-many structure is handled similarly in SQL. (Note the explicit ordering of the joins specified by parentheses.):

        SELECT E.lastName AS "Employee", M.lastName AS "Manager"
        FROM Employees E LEFT OUTER JOIN 
          (Supervisions S INNER JOIN Employees M 
            ON S.managerID = M.employeeID)
          ON E.employeeID = S.employeeID
        ORDER BY E.lastName