This repository was archived by the owner on Jun 15, 2024. It is now read-only.

Contributor

@nh916 nh916 commented Nov 30, 2022

Script to generate Material Sheet controlled vocabulary

Description

This is the implementation that generates all the options for the Excel Uploader material sheet.

This code is only for the material sheet because the other sheets do not have nesting and can just be copied and pasted from the online controlled vocabulary.

How it works

It works by:

  1. The developer gets the controlled vocabulary from CRIPT by copying each part and pasting it into the sheets of source.xlsx
  2. material_sheet_keys.py generates all the options by:
    1. reading each sheet of the source.xlsx file
    2. converting each sheet into a DataFrame (df)
    3. running an algorithm to generate all the controlled vocabulary options up to 2 levels deep, e.g. property:condition

    Note: It does not include any ids for the options, e.g. [1]property:condition

  3. After creating all the options, the script writes them to a new file called utils/excel_files/output.xlsx
  4. The developer can then copy the options generated in output.xlsx and paste them into the hidden sheet of CRIPT_template.xlsx
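
The generation step in item 2 can be sketched roughly as follows. This is a minimal illustration and not the code in material_sheet_keys.py: the function name `generate_options` and the input shape (a mapping from each top-level key to its nested keys) are assumptions, and the sample keys are placeholders rather than the real CRIPT vocabulary. Loading source.xlsx is left out; with pandas, `pd.read_excel(path, sheet_name=None)` would return one DataFrame per sheet.

```python
from typing import Dict, List


def generate_options(vocabulary: Dict[str, List[str]]) -> List[str]:
    """Build every option up to 2 levels deep, e.g. "property:condition".

    `vocabulary` is a hypothetical structure: each top-level key maps to
    the nested keys allowed under it (an empty list means no nesting).
    No ids such as "[1]" are added to the options.
    """
    options: List[str] = []
    for parent, children in vocabulary.items():
        # Level 1: the key on its own.
        options.append(parent)
        # Level 2: the key combined with each nested key.
        for child in children:
            options.append(f"{parent}:{child}")
    return options


if __name__ == "__main__":
    # Placeholder keys for illustration only.
    sample = {"density": ["temperature", "pressure"], "name": []}
    for option in generate_options(sample):
        print(option)
```

Keeping the combination logic separate from the spreadsheet reading would also make it easier to swap the manual source.xlsx step for another data source later.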

nh916 added 30 commits September 8, 2022 10:12
get_preferred_unit method works

single_options can return a successful df
added some comments and doc strings for clarity
…bles, changed instruction for uncertainty_type
@nh916 nh916 requested a review from CVilla17 November 30, 2022 02:16
Contributor

@CVilla17 CVilla17 left a comment

A while back I learned that it's not great practice to keep an Excel file in the master branch of your code if you're using GitHub, because no matter when you clone the repo, that file is downloaded as part of the history. For example, if we later got rid of that file or replaced it with a different one, you would still have to download the old file as part of the repo's history. The code is great and helpful, but I would suggest keeping the file in the organization's drive and pointing to it locally whenever we want to run this.

brili previously approved these changes Dec 1, 2022
@brili
Contributor

brili commented Jan 3, 2023

@nh916 does this still include a manual step of copying and pasting the controlled vocab? We should aim to eliminate the manual process.

nh916 added 3 commits January 3, 2023 13:13
fixed process_source.xlsx and material_source.xlsx to have correct vocabulary
renamed because it is for both process sheet and material sheet
@nh916
Contributor Author

nh916 commented Jan 4, 2023

@brili it does. If I can make a request to an endpoint to get a JSON of the keys, I think that could help eliminate this issue. However, I think there is a bigger problem with this solution: there are so many options that Excel cannot handle them and often crashes.

Contributor Author

nh916 commented Jan 4, 2023

@CVilla17 hmmm, I don't think that would happen, from my experience, but if you have an example or something, let me know and I can look into it more.

Contributor

CVilla17 commented Jan 4, 2023

> @CVilla17 hmmm, I dont think that would happen from my experience but if you have an example or something let me know and I can look more into it

From what I've read online, people suggest using something called Git LFS for things like Excel files, especially if they are over 100MB.

Contributor Author

nh916 commented Jan 4, 2023

@CVilla17 yeah, I've read about that too, but I think Git LFS is only meant for large file storage, because GitHub rejects pushes containing files larger than 100MB. I think the best test for this is to create a separate repo, add an Excel file, remove it, and then see whether the file is still part of what gets cloned.
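
The experiment proposed here can be sketched in a throwaway local repository. This is only an illustration under stated assumptions: the helper names `git` and `deleted_file_stays_in_history` are hypothetical, the file is a small placeholder rather than a real spreadsheet, and the check looks at whether the deleted file's blob is still reachable from history (which is what a full, non-shallow clone transfers) instead of timing an actual clone. It requires the `git` command-line tool to be installed.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run a git command inside `repo` and return its stdout."""
    cmd = [
        "git",
        "-c", "user.name=demo",
        "-c", "user.email=demo@example.com",
        "-c", "commit.gpgsign=false",
        *args,
    ]
    result = subprocess.run(
        cmd, cwd=repo, check=True, capture_output=True, text=True
    )
    return result.stdout


def deleted_file_stays_in_history(filename: str = "source.xlsx") -> bool:
    """Commit a file, delete it in a second commit, then check history."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        git(repo, "init", "--quiet")

        # Placeholder bytes standing in for a binary spreadsheet.
        (repo / filename).write_bytes(b"placeholder spreadsheet contents")
        git(repo, "add", filename)
        git(repo, "commit", "--quiet", "-m", "add spreadsheet")

        git(repo, "rm", "--quiet", filename)
        git(repo, "commit", "--quiet", "-m", "remove spreadsheet")

        gone_from_working_tree = not (repo / filename).exists()
        # Lists every object reachable from any ref, with blob paths.
        reachable = git(repo, "rev-list", "--objects", "--all")
        return gone_from_working_tree and filename in reachable


if __name__ == "__main__":
    print(deleted_file_stays_in_history())
```

If the function returns True, the file is gone from the working tree but its contents are still stored in the repository's history.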

@nh916 nh916 marked this pull request as draft January 9, 2023 17:57
@nh916
Contributor Author

nh916 commented Mar 20, 2023

@bearmit do you have the endpoints for how to get the controlled vocabulary?


bearmit commented Mar 20, 2023

> @bearmit do you have the endpoints for how to get the controlled vocabulary?

This part is not complete. I just know the base URL is something like http://development.api.criptapp.org/api/v1/cv/

EDIT: for info, the URL is node-oriented, not key-oriented. Example: https://development.api.criptapp.org/api/v1/cv/material/
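
A request against a node-oriented URL like the one above could look roughly like this. It is a sketch under heavy assumptions: the endpoint is described as incomplete, so the response format is unknown and the code only returns whatever JSON comes back. The names `cv_url` and `fetch_cv` are hypothetical, and the `opener` parameter exists only so the function can be exercised without network access.

```python
import json
from urllib.request import urlopen

# Base URL as quoted in the comment above; it may change.
BASE_URL = "https://development.api.criptapp.org/api/v1/cv/"


def cv_url(node: str) -> str:
    """Build the controlled vocabulary URL for a node, e.g. "material"."""
    return f"{BASE_URL}{node.strip('/')}/"


def fetch_cv(node: str, opener=urlopen):
    """Fetch and parse the JSON for one node.

    The shape of the returned data is an unknown here, so no keys are
    read from it; callers would need to inspect the real response.
    """
    with opener(cv_url(node)) as response:
        return json.load(response)


if __name__ == "__main__":
    print(cv_url("material"))
```

Fetching per node this way would replace the copy and paste into source.xlsx, assuming the endpoint eventually returns the keys in a usable form.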

