JLop - hole algorithm ------------------------------------ Hole algorithm solves this problem : ------------------------------------ How find first (most) useful byte to download, in a peer2peer multi source download session ? We have, generally, some downloaded fragments for the same file, that does not complete the entire file. So we have some gaps (holes) to fill with a new download process from another peer. Notice that some fragments are 'dead' and some others are 'alive' (still connected to other peer in downloading phase). In other words some fragments are of fixed size, others are growing in size by time. ---------------------------- Hole algorithm specification ---------------------------- The gaps (holes) between fragments can be so definited as : Dead hole : sequential missing bytes of the final file. The upper index limit does not move (the previous fragment is a broken download or is the begin of file). Alive hole : sequential missing bytes of the final file. The upper index limit moves downward (the previous fragment is a running downloading process). The hole algorithm first try to find the early "dead hole" between fragments, if a "dead hole" can't be found, try to find the early "alive hole". In the "dead hole" case we return the first missing byte of the hole. In the "alive hole" case we return the middle missing byte of the hole. If there is not dead nor alive holes so the file is complete. NOTICE : The JLop downloader process generally does not end at the end of the fragment, so to have the probability to download another following missing fragment, we not broke a good peer connection. Mainly for this behavior that the hole algorithm chooses ever the earlier fragment. If the fragment is finished, let's the downloader run until the end of file (or until we found that the file can be fully rebuild): maybe this downloader hits another next missing hole. --------------------- Overlapping fragments --------------------- The 'fragments' in algorithm can overlap one on each other, so a more exact definition of hole is : let F : File "block of sequential bytes" let Ci : Downloaded fragment number i "(sub)blocks of sequential known (downloaded) bytes of file F". let Gi : Hole number i "(sub)blocks of contiguous unknown bytes of file F, bytes that aren't contained in each Ci fragment" (Gi = F - (union of all Ci) ) ------------------------- Rebuilding the final file ------------------------- When all fragments of the file are downloaded (when there are no more holes) we can rebuild the file. When the rebuilder process scans for fragments for making the final file it peeks the longer first fragment, then the longer fragment that contains first missing byte after previous fragment, and so on. Look at the example in the next paragraph. ---------------------------- Hole algorithm examples ---------------------------- .....................0123456789ABCDEF EG-1: 16 bytes file "AAAAbbbbCCCCdddd" AAAA (fragment - stopped) bbbb (missing -- dead hole) CCCC (fragment - running) dddd (missing -- alive hole) We return offset 4 from wich start next download .....................0123456789ABCDEF EG-2: 16 bytes file "AAAABBBBCCCCdddd" AAAA (fragment - stopped) BBBB (fragment - stopped) CCCC (fragment - running) dddd (missing -- alive hole) We return offset E from wich start next download .....................0123456789ABCDEF EG-3: 16 bytes file "AAbbCCCCddEEffff" AA (fragment - stopped) bb (missing -- dead hole) CCCC (fragment - running) dd (missing -- alive hole) EE (fragment - stopped) ffff (missing -- dead hole) We return offset 2 from wich start next download. Notice that this download process will not stop when reaching byte 4. So the download can potentially spawn upon fragments dd and ffff When the rebuild process search for fragments it peeks the longer fragments set that match the final file Rebuilder example : .....................0123456789ABCDEF EG-3: 16 bytes file "AABBCCCCDDEEFGHI" fully downloaded AA (fragment 1) BBCCCCDDEEF (fragment 2) (note that this fragment overlaps on others) CCCCD (fragment 3) (note that this fragment overlaps on the next) DD (fragment 4) EE (fragment 5) FGHI (fragment 6) The rebuilder will use fragment 1, 2, and 6 from D position index : 01 23456789ABC DEF bytes : AA BBCCCCDDEEF GHI fragm : 11 22222222222 666 Fragments 3,4,5 (and first byte of fragment 6) will be discarded, the longer early fragment (2) will be choose.