Conversation
- Add `storage_name` column to `feed_audit` and `feed_audit_detail` - `main_repo_audit.pl` now requires a `--storage_name` parameter - Expected values `s3-truenas-ictc` and `s3-truenas-macc` - `HTFeed/Storage/LocalPairtree.pm` and `HTFeed/StorageAudit.pm` also write `feed_audit.storage_namd` and `feed_audit_detail.storage_name`, respectivelt, but I'm not sure what values that can or should provide - Add a couple of happy path tests for the script as a whole.
- New code is truenas_audit.pl - TODO: several FIXMEs in the code, possible leftovers from main_repo_audit.pl inside truenas_audit.pl
aelkiss
left a comment
There was a problem hiding this comment.
I've looked at this some; I want to look at the tests more, not sure I'll get to that today though.
aelkiss
left a comment
There was a problem hiding this comment.
Tests look good; see comments about the backticks. Not critical, but maybe worth considering.
- Refactor some of the path logic in `truenas_audit.t`
|
@aelkiss Since the scope of this has changed to focus on |
aelkiss
left a comment
There was a problem hiding this comment.
There are definitely some things here to think about in the future:
- how we use the checksums saved in
feed_storage - questions about
feed_auditgoing forward - in the future, merging this with
StorageAuditand making it more testable
I think this will get us what we need for now, and we can make some follow-up issues for the remaining issues.
| localtime( ( stat($metsfile) )[9] ) ); | ||
| } | ||
|
|
||
| my $last_touched = $zip_seconds; |
There was a problem hiding this comment.
$last_touched doesn't appear to be used
|
|
||
|
|
||
| #insert | ||
| execute_stmt( |
There was a problem hiding this comment.
Maybe wait to do this until after the file checks? Maybe worth comparing to the backup audits to see behavior -- that is, do we update lastchecked when we see it or after we verify some part of it?
| is($db_data->[0]->{storage_name}, $storage_name, 'correct storage_name'); | ||
| ok($db_data->[0]->{zip_size} > 0, 'nonzero zip_size'); | ||
| ok($db_data->[0]->{mets_size} > 0, 'nonzero mets_size'); | ||
| ok(!defined $db_data->[0]->{saved_md5sum}, 'not defined saved_md5sum'); |
There was a problem hiding this comment.
Do we want it to save this? Is it worth using if it's there? Probably fine for now to leave as-is, but maybe worth thinking about in the future or otherwise making it more consistent with what we do for the backup auditing.
| CREATE TABLE IF NOT EXISTS `feed_audit` ( | ||
| `namespace` varchar(10) NOT NULL, | ||
| `id` varchar(30) NOT NULL, | ||
| `storage_name` varchar(32) NOT NULL, |
There was a problem hiding this comment.
I think it makes sense to go ahead and roll back these changes
| my $stmt = | ||
| "insert into feed_audit (namespace, id, sdr_partition, zip_size, zip_date, mets_size, mets_date, lastchecked, lastmd5check, md5check_ok) \ | ||
| values(?,?,?,?,?,?,?,CURRENT_TIMESTAMP,CURRENT_TIMESTAMP,1) \ | ||
| "insert into feed_audit (namespace, id, storage_name, sdr_partition, zip_size, zip_date, mets_size, mets_date, lastchecked, lastmd5check, md5check_ok) \ |
There was a problem hiding this comment.
Likewise here.
However, we also need to think about once we turn off the old storage that uses LocalPairtree, we probably still want feed_audit populated? Maybe we need to mark one storage as "primary" for the purposes of feed_audit, or have some kind of view. Regardless, we don't need to do it right now in this PR.
storage_namecolumn tofeed_auditandfeed_audit_detailmain_repo_audit.plnow requires a--storage_nameparameters3-truenas-ictcands3-truenas-maccHTFeed/Storage/LocalPairtree.pmandHTFeed/StorageAudit.pmalso writefeed_audit.storage_namdandfeed_audit_detail.storage_name, respectivelt, but I'm not sure what values that can or should provide