Conversation
|
see also issue #1 |
ctgraham
left a comment
There was a problem hiding this comment.
Some questions and some change requests.
make-batch-dirs
Outdated
| logger.addHandler(handler) | ||
| return logger | ||
|
|
||
| logger = logging # change as needed |
There was a problem hiding this comment.
"logger" should be global so it can be used throughout the code.
There was a problem hiding this comment.
I've moved the initiation of logger into the main function and made it a global variable so that it can be referenced in all functions.
make-batch-dirs
Outdated
| def create_batch_folder(scanning_path, batch_name): | ||
| batch_path = scanning_path+"/"+batch_name | ||
| logger.info(f"Creating Batch Path: {batch_path}") | ||
| print(f"Creating Batch Path: {batch_path}") |
There was a problem hiding this comment.
Why output to STDOUT and to the logger with duplicate content? If this is important, wrap these calls in a helper function to keep the messages DRY.
There was a problem hiding this comment.
This is a stopgap solution I did for figuring out what to do with the print statements in general. Ideally, log statements should be sufficient, but this script might be used within a pipeline, where logs would be better, or used as a cli tool where print statements (or stdout in general) would be more readable
There was a problem hiding this comment.
I addressed this here. This was done by adding a console logger along with a file logger that both are called at the same time. Might be able to enhance this more but would take a bit more looking into.
make-batch-dirs
Outdated
| return batch_path | ||
|
|
||
| def copy_xslx_to_batch(batch_path): | ||
| if os.path.isfile(batch_path+"/manifest.xlsx"): |
There was a problem hiding this comment.
Set the batch_path plus filename to a variable to keep this code DRY.
There was a problem hiding this comment.
This is still very repetitive, but with notable (unintentional) distinctions:
if os.path.isfile(os.path.sep.join([batch_path,'manifest.xlsx'])):
logger.info(f"Copying spreadsheet to {batch_path}/manifest.xlsx")
shutil.copyfile(args.xls_file, os.path.sep.join(batch_path,'manifest.xlsx'))
else:
logger.error(f"Error: {batch_path}/manifest.xlsx does not exist.")
In the if conditional, the arguments are wrapped in an array; in the copyfile directive, the arguments are passed directly. In the logger calls, manual concatenation is done.
If the name manifest.xml ever changes, we have many, many edits to make.
Cleaner would be:
MANIFEST_FILE_PATH = os.path.sep.join([batch_path,'manifest.xlsx'])
if os.path.isfile(MANIFEST_FILE_PATH):
logger.info(f"Copying spreadsheet to {MANIFEST_FILE_PATH}")
shutil.copyfile(args.xls_file, MANIFEST_FILE_PATH)
else:
logger.error(f"Error: {MANIFEST_FILE_PATH} does not exist.")
Bonus, move 'manifest.xml' to a module level constant.
There was a problem hiding this comment.
I'm working on cleaning this up a bit. I did miss a few of the os.path.sep.join statements which I've caught all of those up. Moving to the "Cleaner" version is on the TODO list.
There was a problem hiding this comment.
I made the "Cleaner" version commit update here.
make-batch-dirs
Outdated
| df = sheet.read() | ||
| batch_path = create_batch_folder(scanning_path, batch_name) | ||
| # Make sure the df has an 'id' column and data rows | ||
| if ('id' in df.columns): |
There was a problem hiding this comment.
Why check the existence of the id column and number of rows here, but not for the XLSX pathway? Could this be improved by generating the dataframe regardless of the source, and then calling the same logic to check the id, length, and make_dirs_from_df?
There was a problem hiding this comment.
iirc dataframe is generated in both cases. The check for id column and other error checking can be put inside make_dirs_from_df and then there should be parity
sheetutils.py
Outdated
| } | ||
|
|
||
| # Update the sheet with DataFrame contents | ||
| result = sheet.values().update( |
There was a problem hiding this comment.
Are we happy with throwing all exceptions directly to the caller here? If so, this should be documented, and the caller should have a try.
…ment, when before it was inherited from parent namespace
Remove specific file patterns from .gitignore.
Updated path handling to use os.path.sep for cross-platform compatibility.
Removed commented-out print statement and condition check related to debugging.
…sing into main function. Added a --log-file option.
…pandas call works better.
|
Tested the script to use either a spreadsheet or Google Sheet. currently the script functions as expected. |
Merging google sheets logic