@@ -1685,3 +1685,49 @@ int cmd_survey(int argc, const char **argv, const char *prefix, struct repositor
1685 1685 clear_survey_context(&ctx);
1686 1686 return 0;
1687 1687 }
1688+
1689+ /*
1690+ * NEEDSWORK: The following is a bit of a laundry list of things
1691+ * that I'd like to add.
1692+ *
1693+ * [] Dump stats on all of the packfiles. The number and size of each.
1694+ * Whether each is in the .git directory or in an alternate. The state
1695+ * of the IDX or MIDX files, etc. Delta chain stats. All of this
1696+ * data is relative to the "lived-in" state of the repository. Stuff
1697+ * that may change after a GC or repack.
1698+ *
1699+ * [] Dump stats on each remote. When we fetch from a remote the size
1700+ * of the response is related to the set of haves on the server. You
1701+ * can see this in `GIT_TRACE_CURL=1 git fetch`. We get a `ls-refs`
1702+ * payload that lists all of the branches and tags on the server, so
1703+ * at a minimum the RefName and SHA for each. But for annotated tags
1704+ * we also get the peeled SHA. The size of this overhead on every
1705+ * fetch is proportional to the size of the `git ls-remote` response
1706+ * (roughly, although the latter repeats the RefName of the peeled
1707+ * tag). If, for example, you have 500K refs on a remote, you're
1708+ * going to have a long "haves" message, so every fetch will be slow
1709+ * just because of that overhead (not counting new objects to be
1710+ * downloaded).
1711+ *
1712+ * Note that the local set of tags in "refs/tags/" is a union over all
1713+ * remotes. However, since most people only have one remote, we can
1714+ * probably estimate the overhead value directly from the size of the
1715+ * set of "refs/tags/" that we visited while building the `ref_info`
1716+ * and `ref_array` and not need to ask the remote.
1717+ *
1718+ * [] Dump info on the complexity of the DAG. Criss-cross merges.
1719+ * The number of edges that must be touched to compute merge bases.
1720+ * Edge length. The number of parallel lanes in the history that must
1721+ * be navigated to get to the merge base. What affects the cost of
1722+ * the Ahead/Behind computation? How often do criss-crosses occur and
1723+ * do they cause various operations to slow down?
1724+ *
1725+ * [] If there are primary branches (like "main" or "master") are they
1726+ * always on the left side of merges? Does the graph have a clean
1727+ * left edge? Or are there normal and "backwards" merges? Do these
1728+ * cause problems at scale?
1729+ *
1730+ * [] If we have a hierarchy of FI/RI branches like "L1", "L2", ...,
1731+ * can we learn anything about the shape of the repo around these FI
1732+ * and RI integrations?
1733+ */
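The per-remote item in the list above says the fixed cost of every fetch grows with the number of refs the server lists, and that annotated tags cost more because the peeled object ID is sent too. A rough sketch of that arithmetic follows. It is not part of this patch: `struct ref_entry` and `estimate_ls_refs_bytes()` are hypothetical names, and the model assumes protocol v2 `ls-refs` output where each ref is one pkt-line of the form `<oid> <refname>[ peeled:<oid>]`, ignoring other attributes such as symref targets.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record for illustration; not a structure from the patch. */
struct ref_entry {
	const char *name;     /* full refname, e.g. "refs/tags/v1.0" */
	int is_annotated_tag; /* nonzero if the ref points at a tag object */
};

#define PKT_HEADER_LEN 4          /* 4-hex-digit pkt-line length prefix */
#define PEELED_ATTR " peeled:"    /* attribute sent for annotated tags */

/*
 * Estimate the size in bytes of an ls-refs style listing.
 * hexsz is the length of a hex object ID (40 for SHA-1, 64 for SHA-256).
 * This is an approximation under the assumptions stated above.
 */
static size_t estimate_ls_refs_bytes(const struct ref_entry *refs,
				     size_t nr, size_t hexsz)
{
	size_t total = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		/* pkt-line header + "<oid>" + " " + "<refname>" + "\n" */
		size_t line = PKT_HEADER_LEN + hexsz + 1 +
			      strlen(refs[i].name) + 1;

		/* annotated tags also carry " peeled:<oid>" */
		if (refs[i].is_annotated_tag)
			line += strlen(PEELED_ATTR) + hexsz;

		total += line;
	}

	/* trailing flush-pkt ("0000") that ends the listing */
	return total + PKT_HEADER_LEN;
}
```

Under this model, a survey pass that already visits every ref could sum the estimate while building its ref tables, which matches the suggestion in the comment that the overhead can be approximated locally without asking the remote.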