Remove Git LFS from repo #10

Open: wants to merge 4 commits into main

@orionw (Collaborator) commented Jul 7, 2024

To stay within GitHub's 1 GB Git LFS bandwidth quota, this PR removes the files that use Git LFS:

  • Move the videos to a separate HF space and download them from there. I assume they won't change much, so it's okay to treat them as resources to download.
  • Remove the pickle files (since all binary files have to go through Git LFS) and use JSONL files instead.

@Muennighoff @isaac-chung Can I remove the index_*/passages.*.pt files from Git LFS? I assume not yet, and we can remove them once the other indexes are ready, so just lmk. Those are the last Git LFS files.

@@ -7,6 +7,16 @@
from models import ModelManager
from ui import build_side_by_side_ui_anon, build_side_by_side_ui_anon_sts, build_side_by_side_ui_anon_clustering, build_side_by_side_ui_named, build_side_by_side_ui_named_sts, build_side_by_side_ui_named_clustering, build_single_model_ui, build_single_model_ui_sts, build_single_model_ui_clustering


# download the videos
from huggingface_hub import hf_hub_url
Collaborator Author

At runtime we download the videos from Hugging Face. There's no need to keep them in this repo, as I assume they are static and people won't be iterating on them.
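A minimal sketch of what a runtime download step like this could look like, assuming the videos sit in a Hugging Face space. The repo id `example-org/arena-videos` and the helper names `fetch_from_hub` and `ensure_videos` are hypothetical and are not taken from this PR; only `hf_hub_url` appears in the diff.

```python
import urllib.request
from pathlib import Path
from typing import Callable, Iterable, List

# Hypothetical repo id; the PR does not state where the videos actually live.
VIDEO_REPO_ID = "example-org/arena-videos"


def fetch_from_hub(filename: str) -> bytes:
    """Fetch one file from the (assumed) HF space and return its bytes."""
    # Imported lazily so the rest of the module works without huggingface_hub.
    from huggingface_hub import hf_hub_url

    url = hf_hub_url(repo_id=VIDEO_REPO_ID, filename=filename, repo_type="space")
    with urllib.request.urlopen(url) as response:
        return response.read()


def ensure_videos(
    filenames: Iterable[str],
    dest_dir: str,
    fetch: Callable[[str], bytes] = fetch_from_hub,
) -> List[Path]:
    """Download any missing videos into dest_dir and return all local paths."""
    dest = Path(dest_dir)
    paths = []
    for name in filenames:
        target = dest / name
        target.parent.mkdir(parents=True, exist_ok=True)
        if not target.exists():  # static assets: skip files already on disk
            target.write_bytes(fetch(name))
        paths.append(target)
    return paths
```

Since the videos are assumed to be static, the sketch skips any file that already exists locally, so repeated app starts do not download it again.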

@@ -31,8 +32,7 @@ def main(
if key in model_info[model]:
model_info[model][RENAME_KEYS[key]] = model_info[model].pop(key)

-with open(elo_rating_pkl, "rb") as fin:
-    elo_rating_results = pickle.load(fin)
+elo_rating_results = load_results(elo_rating_folder)
Collaborator Author

Results are now saved to and loaded from folders, which does make the commits quite long since each dataframe is a separate file... sorry!
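For illustration, a folder-based save/load round trip could look roughly like the sketch below. It is not the `load_results` from this PR, whose implementation is not shown here. The names `save_results_sketch` and `load_results_sketch` are hypothetical, and each table is represented as a plain list of row dicts so the example needs only the standard library. With pandas, `DataFrame.to_json(path, orient="records", lines=True)` and `pd.read_json(path, lines=True)` would play the same role.

```python
import json
from pathlib import Path
from typing import Dict, List


def save_results_sketch(results: Dict[str, List[dict]], folder: str) -> None:
    """Write each named table to its own <name>.jsonl file inside folder."""
    out = Path(folder)
    out.mkdir(parents=True, exist_ok=True)
    for name, rows in results.items():
        with open(out / f"{name}.jsonl", "w", encoding="utf-8") as f:
            for row in rows:
                f.write(json.dumps(row) + "\n")  # one JSON object per line


def load_results_sketch(folder: str) -> Dict[str, List[dict]]:
    """Read every *.jsonl file in folder back into a dict keyed by file stem."""
    results = {}
    for path in sorted(Path(folder).glob("*.jsonl")):
        with open(path, encoding="utf-8") as f:
            results[path.stem] = [json.loads(line) for line in f if line.strip()]
    return results
```

Because JSONL is plain text, these files can be committed without Git LFS, at the cost of one file per table.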

@@ -4,3 +4,4 @@ gritlm
 mteb
 plotly
 umap-learn
+kaleido
Collaborator Author

Apparently kaleido is needed to write Plotly plots to file. We could skip saving the plots to file, but they were saved in the pickle files, so I thought we might as well write them to file for now.

Contributor

@isaac-chung isaac-chung left a comment

Generally it looks good! Just wondering if we can simplify the structure a bit.

|-- elo_results_TASK
  |-- anony
-   |-- average_win_rate_bar
-      |-- default.png
+   |-- average_win_rate_bar.png
  |-- full
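One way the flatter layout suggested above could be produced is sketched here, under the assumption that each plot is a Plotly figure written with `Figure.write_image` (which relies on kaleido for static export). The helpers `flat_plot_path` and `save_plot` are hypothetical names and not part of this PR.

```python
from pathlib import Path


def flat_plot_path(root: str, mode: str, plot_name: str, ext: str = "png") -> Path:
    """Return <root>/<mode>/<plot_name>.<ext>, with no per-plot subfolder."""
    return Path(root) / mode / f"{plot_name}.{ext}"


def save_plot(fig, root: str, mode: str, plot_name: str) -> Path:
    """Write a Plotly figure into the flat layout and return the file path."""
    path = flat_plot_path(root, mode, plot_name)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Static image export via write_image is the step that needs kaleido.
    fig.write_image(str(path))
    return path
```

For example, `save_plot(fig, "elo_results_TASK", "anony", "average_win_rate_bar")` would write `elo_results_TASK/anony/average_win_rate_bar.png` instead of `average_win_rate_bar/default.png`.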

Comment on lines 3 to +4
mkdir -p results

mkdir -p results/latest
Contributor

Probably don't need mkdir -p results since we have the new line?

@isaac-chung isaac-chung mentioned this pull request Jul 7, 2024
17 tasks