Commits


fix topological load test failure

Clear the set of known-traversed objects before building the graph for the next commit's root-tree. Otherwise our tree graphs will be incomplete, as objects referred to via multiple commits would only be stored in the graph of one particular root-tree.
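
The effect of this fix can be illustrated with a minimal sketch. It is not the project's code: `build_tree_graphs`, `children`, the `reset_per_commit` flag, and the toy object IDs are hypothetical names chosen for illustration, and plain Python sets stand in for the real graph objects.

```python
def build_tree_graphs(root_trees, children, reset_per_commit=True):
    """Return {root_tree_id: set of object IDs reachable from that root}.

    root_trees: list of root-tree IDs, one per commit (hypothetical input).
    children:   dict mapping a tree ID to the IDs of its entries.
    """
    graphs = {}
    traversed = set()
    for root in root_trees:
        if reset_per_commit:
            # Forget what earlier commits visited, so objects shared
            # between commits are recorded in this root-tree's graph too.
            traversed.clear()
        graph = set()
        stack = [root]
        while stack:
            obj = stack.pop()
            if obj in traversed:
                continue
            traversed.add(obj)
            graph.add(obj)
            stack.extend(children.get(obj, ()))
        graphs[root] = graph
    return graphs


# Two commits whose root trees both reference the object "shared".
children = {"tree1": ["shared", "blob1"], "tree2": ["shared", "blob2"]}
fixed = build_tree_graphs(["tree1", "tree2"], children)
buggy = build_tree_graphs(["tree1", "tree2"], children, reset_per_commit=False)
```

Without the reset, `buggy["tree2"]` lacks `"shared"` because the first commit's traversal already marked it as seen.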


initialize 'counts' earlier to avoid spurious errors from tests

If self.counts is not initialized, then failing tests may produce an additional error about self.counts not existing, rather than failing on actual test assertions.


clear lists of added vertices


make graph file format somewhat configurable


debug


improve tree crawl efficiency and tweak progress display


avoid compression


store trees in subgraphs which can be swapped out to a temporary file

This should reduce memory requirements significantly. Storing all trees of all commits with all tree entries in a single igraph uses too much memory to load repositories such as git.git.
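
A rough sketch of the swapping idea, under stated assumptions: the class name `SwappableSubgraph` and its methods are hypothetical, a plain edge list stands in for an igraph graph, and `pickle` is only one possible serialization (the actual graph file format used by the project is not shown in this log).

```python
import pickle
import tempfile


class SwappableSubgraph:
    """Holds one subgraph's edges in memory or in a temporary file."""

    def __init__(self, edges):
        # List of (parent_id, child_id) pairs; None while swapped out.
        self.edges = edges
        self._file = None

    def swap_out(self):
        """Write the edges to a temporary file and free the in-memory copy."""
        if self.edges is None:
            return
        self._file = tempfile.TemporaryFile()
        pickle.dump(self.edges, self._file)
        self.edges = None

    def swap_in(self):
        """Reload the edges from the temporary file if necessary."""
        if self.edges is None:
            self._file.seek(0)
            self.edges = pickle.load(self._file)
            self._file.close()
            self._file = None
        return self.edges


sub = SwappableSubgraph([("tree1", "blob1"), ("tree1", "tree2")])
sub.swap_out()
swapped = sub.edges is None  # the edges now live only in the temporary file
restored = sub.swap_in()
```

Only the subgraph currently being worked on needs to be resident, which is the source of the memory savings the commit message describes.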


avoid multiple lookups of the same ID in the pack index


use a bitstring to keep track of traversed object IDs

The aim is to reduce the memory footprint of object graph construction.
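
To show why a bit-per-object structure saves memory compared to a set of object IDs, here is a minimal sketch. It assumes each object can be addressed by an integer such as its position in the pack index; that addressing scheme, the class name `TraversedBits`, and the use of a `bytearray` rather than whatever bitstring implementation the commit actually uses are all assumptions.

```python
class TraversedBits:
    """One bit per object, addressed by a hypothetical integer index."""

    def __init__(self, num_objects):
        # Round up to whole bytes; all bits start at 0 (not traversed).
        self.bits = bytearray((num_objects + 7) // 8)

    def mark(self, index):
        """Record that the object at this index has been traversed."""
        self.bits[index >> 3] |= 1 << (index & 7)

    def seen(self, index):
        """Return True if the object at this index was marked."""
        return bool(self.bits[index >> 3] & (1 << (index & 7)))


# One million objects fit into 125,000 bytes.
traversed = TraversedBits(1_000_000)
traversed.mark(42)
```

A Python `set` holding a million IDs needs far more memory than this, since every element carries per-object and hash-table overhead.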


try using a set() instead of commits.keys() to track traversed commits


don't add graph edges to parent commits missing from pack file
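
A minimal sketch of this kind of check, assuming commits are held in a dict keyed by commit ID that contains only commits found in the pack file. The function name `parent_edges` and the data layout are hypothetical.

```python
def parent_edges(commits):
    """Return (commit_id, parent_id) edges, skipping unknown parents.

    commits: dict mapping commit ID -> list of parent IDs, containing
             only the commits that are present in the pack file.
    """
    edges = []
    for commit_id, parents in commits.items():
        for parent_id in parents:
            if parent_id not in commits:
                # The parent is not in the pack file, so there is no
                # vertex to connect to.
                continue
            edges.append((commit_id, parent_id))
    return edges


# "c0" is referenced as a parent but is missing from the pack file.
commits = {"c2": ["c1", "c0"], "c1": []}
edges = parent_edges(commits)
```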


remove incorrect check for missing objects


skip tree entries which are not in the pack file


tweak log level of debug message


attempt to improve debug progress output


tweak phrasing of a debug message


make debug prints less frequent


show debug progress output while searching blobs and trees


handle directly referenced blobs and trees


prevent history walks from taking too long with many merge commits


raise log level of final pack index progress message


show 10 times fewer lines of pack index progress


fix list vs. set type confusion


remove debug prints related to tags