Commits


remove redundant check: we already know that parent is in the pack


fix missing commit parent edges in commit graph


list missing trees more efficiently


show progress output while searching through packed trees


use a faster way to find all blob object IDs


cache bitarrays for recursive tree dependencies


progress reporting improvements


swap tree entry bitarrays out to disk if needed


compress bitarrays which represent tree entries


reduce debug log noise again


switch from graphs to bitarrays for tree entries


switch from bitstring to using bitarray directly


fix topological load test failure
Clear the set of known-traversed objects before building the graph for the next commit's root tree. Otherwise our tree graphs will be incomplete, because objects referenced by multiple commits would be stored only in the graph of one particular root tree.
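The fix described in this commit message can be illustrated with a minimal sketch. It is not the project's code: `collect_tree`, `build_tree_graphs`, and the `children` mapping are hypothetical names, and plain Python sets stand in for the real graph structures.

```python
def collect_tree(root, children, traversed):
    """Return every object reachable from root, skipping IDs already in traversed."""
    found = []
    stack = [root]
    while stack:
        obj = stack.pop()
        if obj in traversed:
            continue
        traversed.add(obj)
        found.append(obj)
        stack.extend(children.get(obj, ()))
    return found


def build_tree_graphs(root_trees, children):
    """Build one set of reachable objects per root tree (hypothetical helper)."""
    traversed = set()
    graphs = {}
    for root in root_trees:
        # Reset before each root tree. Without this, an object shared by
        # several commits would appear only in the first graph that reached it.
        traversed.clear()
        graphs[root] = set(collect_tree(root, children, traversed))
    return graphs
```

With two root trees that both reference the same subtree, each resulting set contains the shared object only because `traversed` is cleared between iterations.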


initialize 'counts' earlier to avoid spurious errors from tests
If self.counts is not initialized, failing tests may produce an additional error about self.counts not existing, rather than failing on the actual test assertions.


clear lists of added vertices


make graph file format somewhat configurable


debug


improve tree crawl efficiency and tweak progress display


avoid compression


store trees in subgraphs which can be swapped out to a temporary file
This should reduce memory requirements significantly. Storing all trees of all commits with all tree entries in a single igraph uses too much memory to load repositories such as git.git.
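As a rough sketch of the swapping idea in this commit message, the class below keeps a bounded number of subgraphs in memory and writes the oldest ones to a temporary file. Everything here is an assumption for illustration: `SwappableStore` and `max_in_memory` are hypothetical names, and `pickle` with plain Python objects stands in for whatever serialization the igraph-based code actually uses.

```python
import os
import pickle
import tempfile


class SwappableStore:
    """Keep at most max_in_memory subgraphs in RAM; swap the rest to a temp file."""

    def __init__(self, max_in_memory):
        self.max_in_memory = max_in_memory
        self.in_memory = {}      # key -> subgraph, in insertion order
        self.offsets = {}        # key -> (offset, length) inside the swap file
        self.swapfile = tempfile.TemporaryFile()

    def put(self, key, subgraph):
        self.in_memory[key] = subgraph
        # Evict the oldest entries until we are back under the limit.
        while len(self.in_memory) > self.max_in_memory:
            oldest = next(iter(self.in_memory))
            self._swap_out(oldest)

    def _swap_out(self, key):
        data = pickle.dumps(self.in_memory.pop(key))
        self.swapfile.seek(0, os.SEEK_END)
        self.offsets[key] = (self.swapfile.tell(), len(data))
        self.swapfile.write(data)

    def get(self, key):
        if key in self.in_memory:
            return self.in_memory[key]
        # Raises KeyError if the key was never stored.
        offset, length = self.offsets[key]
        self.swapfile.seek(offset)
        return pickle.loads(self.swapfile.read(length))
```

Only the subgraphs currently being worked on need to stay resident, at the cost of a disk read when a swapped-out subgraph is requested again.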


avoid multiple lookups of the same ID in the pack index


use a bitstring to keep track of traversed object IDs
The aim is to reduce the memory footprint of object graph construction.
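To show why a bit-per-object structure saves memory compared with a set of object IDs, here is a minimal sketch. It assumes each object can be addressed by an integer index (for example its position in a sorted ID list); `TraversedBits` is a hypothetical name, and a standard-library `bytearray` stands in for the bitstring and bitarray libraries named in these commits.

```python
class TraversedBits:
    """Track traversed objects with one bit per object index."""

    def __init__(self, num_objects):
        # Round up to whole bytes: 8 objects per byte.
        self.bits = bytearray((num_objects + 7) // 8)

    def add(self, index):
        self.bits[index >> 3] |= 1 << (index & 7)

    def __contains__(self, index):
        return bool(self.bits[index >> 3] & (1 << (index & 7)))

    def clear(self):
        # Zero every bit in place without reallocating the buffer.
        self.bits[:] = bytes(len(self.bits))
```

A million objects need about 125 kB this way, whereas a Python set holding a million ID strings costs far more per entry.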


try using a set() instead of commits.keys() to track traversed commits


don't add graph edges to parent commits missing from pack file


remove incorrect check for missing objects