When subjects are numbered sequentially within each site, the subject identification numbers (Subject IDs) restart from 001 at each site. For example, Site 101 may have a Subject 001, and Site 102 may also have a Subject 001. In such cases, the subject number alone is not globally unique across the entire study. Therefore, when integrating or joining data across multiple database tables (for example, linking demographic, adverse event, and laboratory data), both the site number and the subject number are required to form a unique key that accurately identifies each record.
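The effect on a join can be illustrated with a minimal Python sketch. The records, field names (`site`, `subject`, `age`, `term`), and the `join_on` helper are hypothetical and exist only for illustration; they are not taken from any particular clinical database.

```python
# Hypothetical demographic and adverse-event records (illustrative only).
demographics = [
    {"site": "101", "subject": "001", "age": 34},
    {"site": "102", "subject": "001", "age": 58},
]
adverse_events = [
    {"site": "101", "subject": "001", "term": "Headache"},
    {"site": "102", "subject": "001", "term": "Nausea"},
]


def join_on(keys, left, right):
    """Inner-join two lists of dicts on the given key fields."""
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    joined = []
    for row in left:
        for match in index.get(tuple(row[k] for k in keys), []):
            joined.append({**row, **match})
    return joined


# Joining on subject alone cross-links the two different "001" subjects.
by_subject_only = join_on(["subject"], demographics, adverse_events)

# Joining on (site, subject) links each subject only to their own events.
by_composite_key = join_on(["site", "subject"], demographics, adverse_events)

print(len(by_subject_only))   # 4
print(len(by_composite_key))  # 2
```

With the subject number as the only key, each demographic row picks up both adverse events, so the join returns four rows instead of two. Adding the site number to the key removes the false matches.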
According to the Good Clinical Data Management Practices (GCDMP, Chapter on CRF Design and Data Collection), every data record in a clinical trial database must be uniquely and unambiguously identified. This is typically achieved through a composite key, combining identifiers such as site number, subject number, and sometimes study number. The GCDMP specifies that a robust data structure must prevent duplication or mislinking of records across domains or tables.
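One way to check whether a candidate key identifies records uniquely is to count how often each key value occurs. The sketch below assumes a hypothetical enrollment list and a hypothetical `duplicate_keys` helper; the study identifier `ABC-01` is made up.

```python
from collections import Counter


def duplicate_keys(records, key_fields):
    """Return the key values that occur more than once in records."""
    counts = Counter(tuple(r[f] for f in key_fields) for r in records)
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical enrollment records (illustrative only).
enrolled = [
    {"study": "ABC-01", "site": "101", "subject": "001"},
    {"study": "ABC-01", "site": "101", "subject": "002"},
    {"study": "ABC-01", "site": "102", "subject": "001"},
]

# Subject number alone collides across sites.
print(duplicate_keys(enrolled, ["subject"]))                   # [('001',)]

# The composite key has no collisions.
print(duplicate_keys(enrolled, ["study", "site", "subject"]))  # []
```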
Furthermore, FDA and CDISC standards (SDTM model) also emphasize the importance of the unique subject identifier (USUBJID), which is commonly derived by concatenating the study ID, site ID, and subject ID. This ensures traceability, integrity, and accuracy of subject-level data during database joins, data exports, and regulatory submissions.
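A concatenated identifier of this kind might be built as in the following sketch. The `make_usubjid` function, the hyphen separator, and the example study ID are assumptions made for illustration; sponsors define their own conventions, and this is not a prescribed algorithm.

```python
def make_usubjid(study_id, site_id, subject_id, sep="-"):
    """Concatenate study, site, and subject IDs into one identifier.

    The separator and component order are illustrative assumptions.
    """
    parts = [study_id, site_id, subject_id]
    if any(not part.strip() for part in parts):
        raise ValueError("study, site, and subject IDs must all be non-empty")
    return sep.join(part.strip() for part in parts)


# Two subjects numbered 001 at different sites get distinct identifiers.
print(make_usubjid("ABC-01", "101", "001"))  # ABC-01-101-001
print(make_usubjid("ABC-01", "102", "001"))  # ABC-01-102-001
```

Keeping the site and subject numbers as zero-padded strings rather than integers preserves values such as `001` exactly as they were assigned.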
Thus, in the described scenario, since subject numbering restarts at each site, both the site number and the subject number are required to uniquely identify and correctly join subject data across different datasets or tables.
Reference (CCDM-Verified Sources):
SCDM Good Clinical Data Management Practices (GCDMP), Chapter: CRF Design and Data Collection, Section 4.1 – Unique Subject Identification
CDISC SDTM Implementation Guide, Section 5.2 – Subject and Site Identification (Variable: USUBJID)
FDA Guidance for Industry: Computerized Systems Used in Clinical Investigations, Section 6 – Data Integrity and Record Identification