Golden Standard Methodology
Purpose
The golden_standard methodology provides comprehensive validation through explicit, manual verification of all input data and output results.
Scope
This methodology extends basic_check with rigorous manual checks:
- Verification of all pdf_blks and txt_blks input files
- Complete data extraction verification
- Explicit confirmation that no data is missing from results
Covered File Types
- All files covered by basic_check methodology
- Additional verification of input block files (pdf_blks, txt_blks)
Protocol Steps
1. Complete basic_check protocol (all steps)
2. Input Block Verification:
   - Manually review every pdf_blk and txt_blk file
   - Verify that all relevant data from input blocks is captured in results.pkl
   - Explicitly confirm no data is omitted or incorrectly parsed
3. Log Analysis:
   - Verify that .log.csv contains only true anomalies or is empty
   - Each warning in the log must be reviewed and confirmed as legitimate
4. Comprehensive Data Validation:
   - Cross-reference every data point in output files with source blocks
   - Verify data consistency across all output formats (CSV, YAML, pickle)
5. Edge Case Verification:
   - Check handling of unusual data formats or structures
   - Verify error handling and logging for problematic inputs
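The protocol is manual, but a small helper can make the log review and the "no data is missing" check easier to carry out. The Python sketch below is illustrative only and is not part of the methodology. It assumes that results.pkl unpickles to nested dicts, lists, or tuples of plain values, and that .log.csv has a header row; neither layout is specified here. The function names (read_log_rows, flatten_values, find_missing_values) are hypothetical.

```python
import csv
import pickle


def read_log_rows(log_path):
    """Return every data row of the log CSV as a dict.

    An empty list means the log holds no warnings. Any row returned
    still has to be reviewed by hand and confirmed as a true anomaly.
    """
    with open(log_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def flatten_values(obj):
    """Yield every leaf value from nested dicts, lists, tuples, and sets."""
    if isinstance(obj, dict):
        for value in obj.values():
            yield from flatten_values(value)
    elif isinstance(obj, (list, tuple, set)):
        for item in obj:
            yield from flatten_values(item)
    else:
        yield obj


def find_missing_values(expected_values, results_path):
    """Return the expected values that do not appear in the pickle.

    expected_values is the list the reviewer transcribed from the
    input blocks. Values are compared as strings, which is an
    assumption about how the results are stored.
    Only unpickle files from a trusted source.
    """
    with open(results_path, "rb") as handle:
        results = pickle.load(handle)
    present = {str(value) for value in flatten_values(results)}
    return [value for value in expected_values if str(value) not in present]
```

A non-empty result from find_missing_values points the reviewer at values to re-check against the source blocks. An empty result does not replace the manual review, because a string match cannot tell whether a value landed in the correct field.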
Trust Level
This methodology provides strong certification: it represents thorough manual verification that all data is correctly processed and no information is lost during extraction.
Applicable Context
Use this methodology for:
- Release candidate validation
- Critical data processing verification
- Situations requiring high confidence in data accuracy
- Validation of core algorithm functionality
Relationship to basic_check
The golden_standard methodology includes and extends all basic_check requirements, providing a superset of validation guarantees.