+    """ Sort a dict containing paths parts (ie, paths divided in parts and stored as a list). Top paths will be given precedence over deeper paths. """
+    # Find the path that is the deepest, and count the number of parts
+    max_rec = max(len(x) if x else 0 for x in d.values())
+    # Pad other paths with empty parts to fill in, so that all paths will have the same number of parts (necessary to compare correctly, else deeper paths may get precedence over top ones, since the folder name will be compared to filenames!)
+    for key in d.keys():
+        if d[key]:
+            d[key] = ['']*(max_rec-len(d[key])) + d[key]
+    # Sort the dict relatively to the paths alphabetical order
+    d_sort = sorted(d.items(), key=lambda x: x[1])
+    return d_sort
+
 def sort_group(d, return_only_first=False):
     ''' Sort a dictionary of relative paths and cluster equal paths together at the same time '''
     # First, sort the paths in order (this must be a couple: (parent_dir, filename), so that there's no ambiguity because else a file at root will be considered as being after a folder/file since the ordering is done alphabetically without any notion of tree structure).
-    d_sort = sorted(d.items(), key=lambda x: x[1])
+    d_sort = sort_dict_of_paths(d)
     # Pop the first item in the ordered list
     base_elt = (-1, None)
     while (base_elt[1] is None and d_sort):
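Note on the padding step in the hunk above: a minimal sketch of why left-padding the parts lists changes the sort order. The helper name `sort_paths_padded` and the sample data are hypothetical, and unlike the committed code this version builds a padded copy instead of modifying `d` in place.

```python
def sort_paths_padded(d):
    """Hypothetical stand-in for the padded sort shown in the diff.

    Pads every non-empty parts list on the left with '' up to the depth of
    the deepest path, then sorts by the padded lists. Does not mutate d.
    """
    max_rec = max((len(parts) for parts in d.values() if parts), default=0)
    padded = {
        key: ([''] * (max_rec - len(parts)) + list(parts)) if parts else parts
        for key, parts in d.items()
    }
    return sorted(padded.items(), key=lambda item: item[1])


# Sample data (made up): keys are folder indices, values are path parts.
paths = {
    0: ['docs', 'a.txt'],
    1: ['z.txt'],
    2: ['docs', 'sub', 'c.txt'],
}

# Plain list comparison puts the root file 'z.txt' after everything in 'docs'.
naive = [key for key, _ in sorted(paths.items(), key=lambda item: item[1])]

# With padding, '' sorts before any name, so shallower paths come first.
padded = [key for key, _ in sort_paths_padded(paths)]

print(naive)   # [0, 2, 1]
print(padded)  # [1, 0, 2]
```

Because the empty string compares lower than any folder or file name, the root-level file moves to the front, which matches the docstring's "top paths will be given precedence over deeper paths".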
@@ -219,7 +231,9 @@ def majority_vote_byte_scan(relfilepath, fileslist, outpath, blocksize=65535, de
     ''' Main function to synchronize files contents by majority vote
-    The main job of this function is to walk through the input folders and align the files, so that we can compare every files across every folders, one by one.'''
+    The main job of this function is to walk through the input folders and align the files, so that we can compare every files across every folders, one by one.
+    The whole trick here is to align files, so that we don't need to memorize all the files in memory and we compare all equivalent files together: to do that, we ensure that we walk through the input directories in alphabetical order, and we pick the relative filepath at the top of the alphabetical order, this ensures the alignment of files between different folders, without memorizing the whole trees structures.
+    '''
     # (Generator) Files Synchronization Algorithm:
     # Needs a function stable_dir_walking, which will walk through directories recursively but in always the same order on all platforms (same order for files but also for folders), whatever order it is, as long as it is stable.
     # Until there's no file in any of the input folders to be processed:
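Note on the alignment idea described in the docstring above: a rough sketch of how equal relative paths from several folders can be grouped while holding only one pending entry per folder. This is not the code from the commit. It assumes each folder listing is already produced in the same sorted order, and it uses plain string order with `heapq.merge` in place of the (parent_dir, filename) ordering the author describes. The names `align_sorted_listings` and `_tag` and the sample listings are hypothetical.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def _tag(listing, idx):
    """Attach the folder index to every relative path of one listing."""
    for relpath in listing:
        yield relpath, idx


def align_sorted_listings(listings):
    """Yield (relpath, [indices of the folders that contain it]).

    Assumption: every listing is an iterable of relative paths that is
    already sorted in the same order (plain string order in this sketch).
    Only the current head of each listing is held at any time.
    """
    merged = heapq.merge(*(_tag(lst, i) for i, lst in enumerate(listings)))
    for relpath, group in groupby(merged, key=itemgetter(0)):
        yield relpath, [idx for _, idx in group]


# Made-up listings for three input folders, each sorted the same way.
folder_a = ['a.txt', 'docs/b.txt', 'z.txt']
folder_b = ['a.txt', 'z.txt']
folder_c = ['docs/b.txt', 'docs/c.txt', 'z.txt']

aligned = list(align_sorted_listings([folder_a, folder_b, folder_c]))
for relpath, owners in aligned:
    print(relpath, owners)
```

The sketch only works if the walk order and the comparison order agree, which is why the commit needs a stable directory walk and a tree-aware sort key rather than a naive alphabetical sort.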