Preservation Description Information (PDI) falls into four categories: Provenance, Reference, Context, and Fixity Information.
Provenance Information: The information that documents the history of the Content Information. This information tells the origin or source of the Content Information, any changes that may have taken place since it was originated, and who has had custody of it since it was originated. Examples of Provenance Information are the principal investigator who recorded the data, and the information concerning its storage, handling, and migration.
Reference Information: The information that identifies, and if necessary describes, one or more mechanisms used to provide assigned identifiers for the Content Information. It also provides identifiers that allow outside systems to refer, unambiguously, to a particular Content Information. An example of Reference Information is an ISBN.
Context Information: The information that documents the relationships of the Content Information to its environment. This includes why the Content Information was created and how it relates to other Content Information objects.
Fixity Information: The information which documents the authentication mechanisms and provides authentication keys to ensure that the Content Information object has not been altered in an undocumented manner. An example is a Cyclical Redundancy Check (CRC) code for a file.
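To make the four categories concrete, here is a minimal sketch of how a PDI record might be held in memory. The class name, field names, and the choice of plain dicts and lists are assumptions made for illustration; OAIS defines the categories conceptually and does not prescribe any such structure.

```python
from dataclasses import dataclass, field

# Hypothetical category names, mirroring the four definitions above.
CATEGORIES = ("provenance", "reference", "context", "fixity")


@dataclass
class PreservationDescription:
    """Illustrative container for the four PDI categories (not an OAIS API)."""
    provenance: list = field(default_factory=list)  # ordered custody/change events
    reference: dict = field(default_factory=dict)   # identifiers, e.g. an ISBN
    context: dict = field(default_factory=dict)     # relationships to other objects
    fixity: dict = field(default_factory=dict)      # e.g. a CRC value for a file

    def missing_categories(self):
        """Return the names of categories that have not been filled in yet."""
        return [name for name in CATEGORIES if not getattr(self, name)]


# Usage: a record that so far carries only Reference Information.
pdi = PreservationDescription(reference={"isbn": "placeholder-value"})
print(pdi.missing_categories())
```

A check like `missing_categories()` could be one way for an ingest step to flag submissions whose description is incomplete.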
Some examples of PDI:
Checksum - A simple error-detection scheme in which each transmitted message is accompanied by a numerical value based on the number of set bits in the message. The receiving station then applies the same formula to the message and checks to make sure the accompanying numerical value is the same. If not, the receiver can assume that the message has been garbled. -- Webopedia Computer Dictionary Online
Cyclical Redundancy Check (CRC) - Short for cyclic redundancy check, a common technique for detecting data transmission errors. Transmitted messages are divided into predetermined lengths that are divided by a fixed divisor. According to the calculation, the remainder number is appended onto and sent with the message. When the message is received, the computer recalculates the remainder and compares it to the transmitted remainder. If the numbers do not match, an error is detected. -- Webopedia Computer Dictionary Online
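In the OAIS model, a checksum or a CRC would be used during the ingest process to verify that incoming files were not corrupted during transfer. A mismatch shows that the file was altered in transit; a match makes accidental corruption very unlikely but cannot rule it out entirely.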
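The two schemes described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not an OAIS-mandated procedure: the function names are hypothetical, the simple checksum follows the "number of set bits" description quoted above, and the CRC uses the CRC-32 variant provided by Python's standard `zlib` module.

```python
import zlib


def set_bit_checksum(data: bytes) -> int:
    """Simple checksum: count the 1-bits in the data."""
    return sum(bin(byte).count("1") for byte in data)


def crc32_hex(data: bytes) -> str:
    """CRC-32 of the data as eight lowercase hex digits."""
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08x")


def verify_on_ingest(data: bytes, expected_crc: str) -> bool:
    """Return True if the received bytes match the CRC sent with them."""
    return crc32_hex(data) == expected_crc.lower()


# Usage: the sender computes the CRC; the archive recomputes it on receipt.
sent = b"example article body"
crc = crc32_hex(sent)

# 'a' (0x61) garbled to 'b' (0x62): same number of set bits, different bytes.
garbled = b"exbmple article body"

print(verify_on_ingest(sent, crc))                            # True
print(verify_on_ingest(garbled, crc))                         # False
print(set_bit_checksum(sent) == set_bit_checksum(garbled))    # True
```

The last line shows why the stronger check is usually preferred: the bit-count checksum does not notice this particular change, while the CRC does. Both schemes detect accidental damage only and offer no protection against deliberate alteration.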
Uniform Resource Identifier (URI) - A string of characters that identifies a resource. Used as a "persistent identifier," a unique URI is assigned to each incoming SIP file so that related files can be linked together while remaining distinct from other files. For electronic journal submissions, any A/V/I files embedded in or linked to the e-journal articles would be assigned their own URIs. DSpace uses the Handle System, developed by the Corporation for National Research Initiatives (CNRI), to assign its URIs.
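The idea of linking the files of one submission while keeping each file distinct can be sketched as follows. This is not how DSpace or the Handle System assigns identifiers; as a stand-in it uses random UUID URNs from Python's standard `uuid` module, and the function name, the record layout, and the file names are assumptions for illustration.

```python
import uuid


def assign_identifiers(filenames):
    """Give the package one URI and each file its own distinct URI."""
    package_uri = uuid.uuid4().urn  # a URI of the form "urn:uuid:..."
    return {
        name: {"uri": uuid.uuid4().urn, "part_of": package_uri}
        for name in filenames
    }


# Usage: an article with two attached media files (hypothetical names).
ids = assign_identifiers(["article.pdf", "figure1.tif", "clip.mov"])
for name, record in ids.items():
    print(name, record["uri"], record["part_of"])
```

Each file receives its own `uri`, and the shared `part_of` value is what ties the files of one submission together.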