Basic structures: rows and tables

Representing data in rows

Each real-world individual of a class (for example, each customer who does business with our enterprise) is represented by a row of information in a database table. The row is defined in the relational model as a tuple that is constructed over a given scheme. Mathematically, the tuple is a function that assigns a constant value from the attribute domain to each attribute of the scheme. Notice that because the scheme is a set of attributes, we could show them in any order without changing the meaning of the data in the row (tuple).

Customers row or tuple

Other views of this diagram: Large image - Description (text)

In formal notation, we could show the assignments explicitly, where t represents a tuple:

tTJ = ‹cfirstname := 'Tom', clastname := 'Jewett', cphone := '714-555-1212', cstreet := '10200 Slater', czipcode := '92708'›

In practice, when we create a table row in SQL, we are actually making the assignment of domain values to attributes, just as in the tuple definition.

        INSERT INTO customers
                 (cfirstname, clastname, cphone, cstreet, czipcode) 
          VALUES ('Tom', 'Jewett', '714-555-1212', '10200 Slater', '92708');

In SQL, you can omit the attribute names from the INSERT INTO statement, as long as you keep the comma-delimited list of values in exactly the same order that was used to create the table.

When we change the data in a table row using SQL, we are also following the tuple definition of assigning domain values to attributes.

        UPDATE customers 
          SET cphone = '714-555-2323' 
          WHERE cphone = '714-555-1212';

Tables

A database table is simply a collection of zero or more rows. This follows from the relational model definition of a relation as a set of tuples over the same scheme. (The name “relational model” comes from the relation being the central object in this model.)

Customers table or relation

Other views of this diagram: Large image - Description (text)

• Knowing that the relation (table) is a set of tuples (rows) tells us more about this structure, as we saw with schemes and domains.

- Each tuple/row is unique; there are no duplicates

- Tuples/rows are unordered; we can display them in any way we like and the meaning doesn’t change. (SQL gives us the capability to control the display order.)

- Tuples/rows may be included in a relation/table set if they are constructed on the scheme of that relation; they are excluded otherwise. (It would make no sense to have an Order row in the Customers table.)

- We can define subsets of the rows by specifying criteria for inclusion in the subset. (Again, this is part of a SQL query.)

- We can find the union, intersection, or difference of the rows in two or more tables, as long as they are constructed over the same scheme.

Insuring unique rows

Since each row in a table must be unique, no two rows can have exactly the same values for every one of their attributes. Therefore, there must be some set of attributes (it might be the set of all attributes) in each relation whose values, taken together, guarantee uniqueness of each row. Any set of attributes that can do this is called a super key (SK). Super keys are a property of the relation (table), filled in with any reasonable set of real-world data, even though we show them in the relation scheme drawing for convenience.

The database designer picks one of the possible super key attribute sets to serve as the primary key (PK) of the relation. (Notice that the PK is an SK, but not all SKs are PKs!) The PK is sometimes also called a unique identifier for each row of the table. This is not an arbitrary choice—we’ll discuss it in detail on a later page. For our customers table, we’ll pick the customer’s first name, last name, and phone number. We are likely to have at least two customers with the same first and last name, but it is very unlikely that they will both have the same phone number.

In SQL, we specify the primary key with a constraint on the table that lists the PK attributes. We also give the constraint a name that is easy for us to remember later (like “customers_pk” here).

        ALTER TABLE customers 
          ADD CONSTRAINT customers_pk
          PRIMARY KEY (cfirstname, clastname, cphone);

We also can specify the primary key when we create the table. The NOT NULL constraint prevents the PK attributes from being left empty, since NULL is a special constant in database systems that means “this field doesn’t have any value assigned to it.” It’s not the same as a zero length string or the number zero.

        CREATE TABLE customers (
          cfirstname    VARCHAR(20) NOT NULL,
          clastname     VARCHAR(20) NOT NULL,
          cphone        VARCHAR(20) NOT NULL,
          cstreet       VARCHAR(50),
          czipcode      VARCHAR(5)),
          CONSTRAINT customers_pk
            PRIMARY KEY (cfirstname, clastname, cphone);