Field (computer science)

In computer science, a field (data field) is a data element of a record.

In a relational database, data is arranged as sets of records (also known as rows), each of which consists of the same sequence of fields.

In object-oriented programming, an object is a record that consists of data and function fields.[1]

Example

The following Java class has three fields: firstName, lastName, and age.

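// A record type whose instances each hold three data fields.
public class Person
{
	private String firstName; // text field
	private String lastName;  // text field
	private int age;          // 32-bit integer field
}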

Fixed vs. variable length

In row-based storage, fields are typically either fixed or variable length.

Fields that contain a fixed number of bits are known as fixed-length fields. For example, a four-byte field may contain a 31-bit binary integer plus a sign bit (32 bits in all), and a 30-byte name field may contain a person's name, typically padded with blanks at the end. The disadvantage of fixed-length fields is that space must be reserved for the maximum-length case, so part of the field is wasted whenever the value is shorter. In addition, when a field is omitted, padding is still required in its place so that the fields that follow keep fixed start positions within the record.
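The fixed-length layout described above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not a description of any particular database: the class name FixedLengthRecord, the field order (age first, then name), ASCII text, and big-endian byte order are all choices made for the example. Java stores the integer in two's complement form rather than as a separate sign bit plus magnitude, but it still occupies exactly four bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical layout for illustration: a 4-byte signed age followed by a
// 30-byte name padded with blanks, so every record is exactly 34 bytes.
final class FixedLengthRecord {
    static final int AGE_OFFSET = 0;
    static final int AGE_LENGTH = 4;
    static final int NAME_OFFSET = AGE_OFFSET + AGE_LENGTH;
    static final int NAME_LENGTH = 30;
    static final int RECORD_LENGTH = NAME_OFFSET + NAME_LENGTH;

    // Encodes one record. Names longer than 30 bytes are rejected, not truncated.
    static byte[] encode(int age, String name) {
        byte[] nameBytes = name.getBytes(StandardCharsets.US_ASCII);
        if (nameBytes.length > NAME_LENGTH) {
            throw new IllegalArgumentException("name exceeds " + NAME_LENGTH + " bytes");
        }
        byte[] record = new byte[RECORD_LENGTH];
        // Fill the whole name field with blanks, then overwrite the front of it.
        Arrays.fill(record, NAME_OFFSET, RECORD_LENGTH, (byte) ' ');
        ByteBuffer.wrap(record).putInt(AGE_OFFSET, age);
        System.arraycopy(nameBytes, 0, record, NAME_OFFSET, nameBytes.length);
        return record;
    }

    // Because every record has the same size, the position of any field in
    // any record is a simple multiplication plus a constant offset.
    static int readAge(byte[] file, int recordIndex) {
        return ByteBuffer.wrap(file).getInt(recordIndex * RECORD_LENGTH + AGE_OFFSET);
    }

    static String readName(byte[] file, int recordIndex) {
        int start = recordIndex * RECORD_LENGTH + NAME_OFFSET;
        int end = start + NAME_LENGTH;
        while (end > start && file[end - 1] == ' ') {
            end--; // drop the blank padding
        }
        return new String(file, start, end - start, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] first = encode(36, "Ada Lovelace");
        byte[] second = encode(41, "Alan Turing");
        byte[] file = new byte[2 * RECORD_LENGTH];
        System.arraycopy(first, 0, file, 0, RECORD_LENGTH);
        System.arraycopy(second, 0, file, RECORD_LENGTH, RECORD_LENGTH);
        // Jump straight to the second record without scanning the first.
        System.out.println(readName(file, 1) + ", " + readAge(file, 1));
    }
}
```

Both names occupy the full 30 bytes on disk even though they are much shorter, which is the wasted space the paragraph describes. In exchange, reading record n never requires looking at records 0 to n-1.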

A variable-length field is not always the same physical size. Such fields are nearly always used for text fields that can be large, or for fields that vary greatly in length. For example, a bibliographic database such as PubMed has many small fields, such as publication date and author name, but also has abstracts, which vary greatly in length. Reserving a fixed-length field for abstracts would be a poor choice because it would impose a maximum length on them, and because space would be wasted in most records (particularly if many articles lacked abstracts entirely).

Database implementations commonly store varying-length fields in special ways so that all the records of a given type have a uniform, small size, which can help performance. On the other hand, data in serialized form, such as data stored in typical file systems or transmitted across networks, usually uses quite different strategies. The choice depends on factors such as the total size of records, the performance characteristics of the storage medium, and the expected patterns of access.
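One way to keep records uniformly small is to move the variable-length data out of the record and leave only a fixed-size reference behind. The following Java sketch models that idea in memory; the names OutOfLineStore, Row, and heap are hypothetical, and a real database would manage pages on disk rather than Java objects.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative model only: each row holds three integers, so all rows have
// the same size no matter how long the abstract text is.
final class OutOfLineStore {
    static final class Row {
        final int year;            // small fixed-length field, stored inline
        final int abstractOffset;  // where the abstract starts in the heap
        final int abstractLength;  // how many bytes it occupies

        Row(int year, int abstractOffset, int abstractLength) {
            this.year = year;
            this.abstractOffset = abstractOffset;
            this.abstractLength = abstractLength;
        }
    }

    private final List<Row> rows = new ArrayList<>();
    private final ByteArrayOutputStream heap = new ByteArrayOutputStream();

    void add(int year, String abstractText) {
        byte[] bytes = abstractText.getBytes(StandardCharsets.UTF_8);
        rows.add(new Row(year, heap.size(), bytes.length));
        heap.write(bytes, 0, bytes.length); // append the text to the separate area
    }

    // Reads a fixed-length field without touching the variable-length data.
    int yearOf(int rowIndex) {
        return rows.get(rowIndex).year;
    }

    // Needs a second lookup in the heap. Copying the whole heap here keeps
    // the sketch short; a real store would read only the requested range.
    String abstractOf(int rowIndex) {
        Row row = rows.get(rowIndex);
        return new String(heap.toByteArray(), row.abstractOffset,
                row.abstractLength, StandardCharsets.UTF_8);
    }
}
```

Queries that only need yearOf stay within the compact rows, while abstractOf pays for an extra access to a different location, which is the trade-off such layouts accept.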

Database implementations typically store variable-length fields in one of the following ways:

A sequence of characters or bytes, followed by an end-marker that is prohibited within the string itself. This makes it slower to access later fields in the same record, because they are not always at the same physical distance from the start of the record.
A pointer to data in some other location, such as a URI, a file offset (and perhaps a length), or a key identifying a record in some special place. This typically speeds up processes that do not need the contents of the variable-length fields, but slows down processes that do.
A length prefix followed by the specified number of characters or bytes. This avoids the search for an end-marker required by the first method, and avoids the loss of locality of reference incurred by the second. On the other hand, it imposes a maximum length: the largest number that can be represented in the (generally fixed-length) prefix. In addition, records still vary in length and must be traversed in order to reach later fields.
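The end-marker and length-prefix methods can be compared with a short Java sketch. The details are assumptions made for illustration: the class name VariableLengthFields, a zero byte as the end-marker, UTF-8 text, and a two-byte big-endian prefix (which caps each field at 65,535 bytes).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class VariableLengthFields {
    static final byte END_MARKER = 0; // assumed marker; must not occur inside a field
    static final int MAX_PREFIXED_LENGTH = 0xFFFF; // largest value a 2-byte prefix holds

    // Method 1: each field is its bytes followed by the end-marker.
    static byte[] encodeTerminated(String... fields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String field : fields) {
            byte[] bytes = field.getBytes(StandardCharsets.UTF_8);
            for (byte b : bytes) {
                if (b == END_MARKER) {
                    throw new IllegalArgumentException("field contains the end-marker");
                }
            }
            out.write(bytes, 0, bytes.length);
            out.write(END_MARKER);
        }
        return out.toByteArray();
    }

    // Reaching field n means scanning every byte of fields 0..n-1.
    static String readTerminated(byte[] record, int fieldIndex) {
        int start = 0;
        for (int i = 0; i < fieldIndex; i++) {
            while (record[start] != END_MARKER) start++;
            start++; // step over the marker itself
        }
        int end = start;
        while (record[end] != END_MARKER) end++;
        return new String(record, start, end - start, StandardCharsets.UTF_8);
    }

    // Method 2: each field is a 2-byte length followed by that many bytes.
    static byte[] encodePrefixed(String... fields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String field : fields) {
            byte[] bytes = field.getBytes(StandardCharsets.UTF_8);
            if (bytes.length > MAX_PREFIXED_LENGTH) {
                throw new IllegalArgumentException("field too long for a 2-byte prefix");
            }
            out.write(bytes.length >>> 8);   // high byte of the length
            out.write(bytes.length & 0xFF);  // low byte of the length
            out.write(bytes, 0, bytes.length);
        }
        return out.toByteArray();
    }

    // Still a traversal, but each earlier field is skipped in one jump
    // instead of being scanned byte by byte.
    static String readPrefixed(byte[] record, int fieldIndex) {
        int pos = 0;
        for (int i = 0; i < fieldIndex; i++) {
            pos += 2 + lengthAt(record, pos);
        }
        return new String(record, pos + 2, lengthAt(record, pos), StandardCharsets.UTF_8);
    }

    private static int lengthAt(byte[] record, int pos) {
        return ((record[pos] & 0xFF) << 8) | (record[pos + 1] & 0xFF);
    }
}
```

In both encodings an empty field costs only its marker or prefix, not a full reserved slot. Neither reader checks that the requested field index exists; a sketch like this would need bounds checks before being used on untrusted data.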

If a varying-length field is often empty, additional optimizations come into play.

References

[1] "Data fields". Sliccware. Retrieved 2011-08-12.
