This bug was diagnosed and the fix was identified by Nathaniel Filardo of Johns Hopkins. All I'm providing is an easy test case and a writeup.

There is a permanent-data-loss bug in venti/copy: if it is copying a block of hashes, and the last byte(s) of the last hash in the block are zero, then the copy loop skips copying that hash (and all of its descendants). This is low-probability (I think roughly 1/(256*VtEntrySize) = .01%) but very reproducible.

I have enclosed these files to help reproduce the bug:

* "find-trouble.c" locates an 8K hunk of the fortune database which has a SHA1 hash ending with a zero byte
* "trouble" is an 8K file containing such a hunk
* "transcript" shows how to use vac to store an "un-copyable" vac archive into one venti and how to verify that venti/copy skipped copying the "trouble" file from that venti to another.

It is probably important to make sure that, before running vac, the "trouble" file was added to the directory last, so that vac will add the file's hash to the *end* of the relevant block.

P.S.
The particular form of the change to the loop test matches some code in P9P's venti/copy:
http://tinyurl.com/neuujzc

Reference: /n/sources/patch/venti-copy-hash0-skip
Date: Fri Jun 20 22:46:33 CES 2014
Signed-off-by: davide+p9@cs.cmu.edu

--- /sys/src/cmd/venti/copy.c Fri Jun 13 23:45:33 2014
+++ /sys/src/cmd/venti/copy.c Fri Jun 13 23:45:32 2014
@@ -123,7 +123,7 @@
 		break;
 	case VtDirType:
-		for(i=0; i
+#include
+#include
+#include
+#include
+
+Biobufhdr bin;
+uchar binbuf[256*1024];
+
+void
+main(int argc, char *argv[])
+{
+	char fbuf[8192];
+	uchar sha1digest[SHA1dlen];
+	char *fortunepath = "/sys/games/lib/fortunes";
+	int fortunes;
+	int got;
+
+	if ((fortunes = open(fortunepath, OREAD)) < 0)
+		sysfatal("%s: %r", fortunepath);
+
+	Binits(&bin, fortunes, OREAD, binbuf, sizeof binbuf);
+
+	do {
+		vlong pos;
+		char *newline;
+
+		pos = Boffset(&bin);
+
+		if ((got = Bread(&bin, fbuf, sizeof fbuf)) <0)
+			sysfatal("%s: read(): %r", fortunepath);
+
+		sha1((uchar *)fbuf, sizeof fbuf, sha1digest, nil);
+
+		if (sha1digest[SHA1dlen - 1] == 0){
+			write(1, fbuf, sizeof fbuf);
+			exits(nil);
+		}
+		/* May as well START on a fortune boundary (if we can) */
+		if ((newline = strchr(fbuf, '\n')) != nil)
+			Bseek(&bin, pos + (newline - fbuf + 1), 0);
+		else
+			Bseek(&bin, pos + 1, 0);
+	} while (got == sizeof fbuf);
+}
--- /sys/src/cmd/venti/test-data/transcript Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/venti/test-data/transcript Fri Jun 13 23:45:34 2014
@@ -0,0 +1,58 @@
+cpu% echo $sysname
+spice
+
+cpu% echo $venti
+tcp!127.1!17034
+
+cpu% cd /usr/davide/9projects/nwf-venticopy
+
+cpu% sha1sum *
+c03e1c65fa0164390a211001733244d18341b57f 8.find-trouble
+0fa00dde56be2856f9537fee51525dcb8fa42051 find-trouble.8
+a857283380ebe2895d27f3ae4502f46376a528d7 find-trouble.c
+0ec1e820a48a06552d56ee934235e3969d8da500 trouble
+
+cpu% vac -v .
+8.find-trouble
+find-trouble.8
+find-trouble.c
+trouble
+vac:f19254ae7a33e7f9460cbfe1dd8f6a259b5ff287
+
+cpu% unvac -tv vac:f19254ae7a33e7f9460cbfe1dd8f6a259b5ff287
+--rwxrwxr-x davide davide 66041 2014-06-09 18:00 8.find-trouble
+--rw-rw-r-- davide davide 2315 2014-06-09 18:00 find-trouble.8
+--rw-rw-r-- davide davide 1163 2014-06-09 17:59 find-trouble.c
+--rw-rw-r-- davide davide 8192 2014-06-09 18:01 trouble
+
+cpu% unvac -c vac:f19254ae7a33e7f9460cbfe1dd8f6a259b5ff287 trouble | cmp /fd/0 trouble
+cpu%
+
+cpu% venti/copy $venti tcp!depraz.ugrad.cs.cmu.edu!17036 f19254ae7a33e7f9460cbfe1dd8f6a259b5ff287
+
+cpu% unvac -h tcp!depraz.ugrad.cs.cmu.edu!17036 -c vac:f19254ae7a33e7f9460cbfe1dd8f6a259b5ff287 trouble | cmp /fd/0 trouble
+EOF on /fd/0 after 0 bytes
+
+cpu% unvac -h tcp!depraz.ugrad.cs.cmu.edu!17036 -c vac:f19254ae7a33e7f9460cbfe1dd8f6a259b5ff287 find-trouble.c | cmp /fd/0 find-trouble.c
+
+cpu% unvac -h tcp!depraz.ugrad.cs.cmu.edu!17036 -c vac:f19254ae7a33e7f9460cbfe1dd8f6a259b5ff287 find-trouble.8 | cmp /fd/0 find-trouble.8
+
+cpu% venti/read -h $venti a857283380ebe2895d27f3ae4502f46376a528d7 > /dev/null
+venti/read -htcp!127.1!17034 a857283380ebe2895d27f3ae4502f46376a528d7 0
+cpu% venti/read -h $venti 0ec1e820a48a06552d56ee934235e3969d8da500 > /dev/null
+venti/read -htcp!127.1!17034 0ec1e820a48a06552d56ee934235e3969d8da500 0
+cpu%
+
+cpu% venti/read -h tcp!depraz.ugrad.cs.cmu.edu!17036 a857283380ebe2895d27f3ae4502f46376a528d7 > /dev/null
+venti/read -htcp!depraz.ugrad.cs.cmu.edu!17036 a857283380ebe2895d27f3ae4502f46376a528d7 0
+cpu% venti/read -h tcp!depraz.ugrad.cs.cmu.edu!17036 0ec1e820a48a06552d56ee934235e3969d8da500 > /dev/null
+venti/read: could not read block: no block with score 0ec1e820a48a06552d56ee934235e3969d8da500/16 exists
+
+cpu% /usr/davide/trees/nwf-venticopy/files/cmd/venti/8.copy $venti tcp!depraz.ugrad.cs.cmu.edu!17036 f19254ae7a33e7f9460cbfe1dd8f6a259b5ff287
+
+cpu% venti/read -h tcp!depraz.ugrad.cs.cmu.edu!17036 0ec1e820a48a06552d56ee934235e3969d8da500 > /dev/null
+venti/read -htcp!depraz.ugrad.cs.cmu.edu!17036 0ec1e820a48a06552d56ee934235e3969d8da500 0
+
+cpu% unvac -h tcp!depraz.ugrad.cs.cmu.edu!17036 -c vac:f19254ae7a33e7f9460cbfe1dd8f6a259b5ff287 trouble | cmp /fd/0 trouble
+cpu%
+
--- /sys/src/cmd/venti/test-data/trouble Thu Jan 1 00:00:00 1970
+++ /sys/src/cmd/venti/test-data/trouble Fri Jun 13 23:45:34 2014
@@ -0,0 +1,145 @@
+ Tape dump of all filesystems taken this afternoon.
+"Analysis" is "design" spelled backwards.
+"RULE 7: Option arguments cannot be optional." Sys V Interface p 343.
+"Sound advice" is usually mostly sound with not much advice.
+"The Dresser," first a play, then a movie, then AT&T headquarters.
+"The more you drive ... the less intelligent you are." -- Miller, in Repo Man
+"x.c", line 1: cannot recover from earlier errors: goodbye!
+#(cat probably won't work on 8 bit files, you will have to use a simpler filter)
+$! nulled, predecessor circle
+$3,000,000
+%-W-NORML Normal Successful Completion
+'Tis a gross error, held in schools, That Fortune always favours fools. - John Gay.
+'Tis better playing with a lion's whelp,/Than with an old one dying.
+'Tis nobler in the mind to Look Things Up than to Pull Them Out Of The Air.
+(^|[ (,;])(([Jj]ul[^ ]* *|(07|7)/)0*5)([^0123456789]|$)
+* UNIX is a Trademark of Bell Laboratories.
+*** LGP ERROR ***: Initialize -- Cannot initialize graphics hardware device.
+*** REPLACE THIS LINE WITH YOUR MESSAGE ***
+-1: No code table for op: ++post
+...and on the seventh day, He exited from append mode.
+/usr/games/fortune: not found
+/usr/news/gotcha
+1 bulls, 3 cows
+127 now in the unregulated subsidiary; see /usr/news/btl-split
+2 is always smaller than 3, even for large values of 2.
+2 lacks generality in a way that 1 doesn't.
+23. ... r-q1
+355/113 -- not the famous irrational number PI, but an incredible simulation.
+4.2BSD is like a nightmare about Tenex. - Geoff Collyer
+4.2BSD may not be a complete disaster, but it does a good job of emulating one.
+55mph -- It's not a good idea, it's just the law.
+: is not an identifier
+A 10.0-szer 0.1 sohasem 1.0!
+A 6-char limit is like a 6-inch trout: throw it back.
+A 6-char limit is like night.
+A Cray is the best machine for simulating the performance of a Cray.
+A Point is a pair of shorts.
+A Smith and Wesson beats four aces.
+A bad workman quarrels with his tools.
+A big book is a big nuisance. - Callimachus, librarian of Alexandria
+A billion here, a billion there; soon you're talking real money. -E. Dirksen
+A bird in hand is safer than one overhead.
+A bird in the hand is worth what it will bring.
+A block grant is a nice terminal, but it will keep you awake until noon.
+A block grant is a solid mass of money surrounded on all sides by governors.
+A chicken is just an egg's way of making more eggs.
+A closed mouth gathers no foot.
+A conservative is one who is too cowardly to fight and too fat to run.
+A consistent indentation style is the hobgoblin of little minds.
+A couch is as good as a chair.
+A coward is one who in a perilous emergency thinks with his legs.
+A critic is a legless man who teaches running.
+A day for firm decisions!!!!! Or is it?
+A day without sunshine is like night.
+A duck with three wings and a loaf of bread is brother to the sun god.
+A duck with three wings and a loaf of bread is brother to the turkey.
+A fly by night leaves no shadow beyond a doubt.
+A fool and his money stabilize the economy.
+A foolish consistency is the hobgoblin of little minds.
+A furore Normanorum libera nos, O Domine!
+A game, a teaching aid, a sport, and a piece of art. -Erno Rubik
+A gentleman is one who is never rude unintentionally. -Noel Coward
+A good memory does not equal pale ink.
+A good plan today is better than a perfect plan tomorrow.
+A goodly apple rotten at the heart:/Oh, what a goodly outside falsehood hath!
+A hacker does for love what others would not do for money.
+A homeowner's reach should exceed his grasp, or what's a weekend for?
+A hydrogen bomb doesn't care how brave you are.
+A hypothesis is an opinion that you are trying to prove true.
+A journey of a thousand miles begins with a cash advance from Sam.
+A king's castle is his home.
+A liberal is one too open-minded to take his own side in an argument.
+A lone dime always gets the number nearly right.
+A man does not attain the status of Galileo merely because he is persecuted; he must also be right. -Stephen Jay Gould
+A man must destroy himself before others can destroy him. -Mong Tse
+A man who fishes for marlin in ponds will put his money in Etruscan bonds.
+A man who turns green has eschewed protein.
+A man with 3 wings and a dictionary is cousin to the turkey.
+A man without a faith is like a fish without a bicycle.
+A megabyte here, a megabyte there, and pretty soon you're talking real power.
+A megaflop is a failure of gigantic proportions.
+A mighty maze! but not without a plan. -Pope
+A more wretched hive of scum and villainy: not found.
+A movie studio is the best toy a boy ever had. -Orson Welles
+A penny saved is a political breakthrough.
+A penny saved is ridiculous.
+A penny saved kills your career in the Pentagon.
+A philosopher does not need a torch to gather glow-worms by at mid-day. -- Earnest Bramah
+A plague o' both your houses! They have made worms' meat of me.
+A plucked goose doesn't lay golden eggs.
+A poet who reads his verse in public may have other nasty habits.
+A professor is one who talks in someone else's sleep.
+A radical is a person with both feet firmly planted in the air.
+A real Initiation never ends. -Aleister Crowley
+A really busy person never knows how much he weighs. -Edgar Watson Howe
+A recently completed trial proved TOAD Generic 1 to be reliable, user-friendly, and convenient.
+A resort area will be part of your next holiday plans
+A rolling stone gathers momentum.
+A rose by any other name would still have thorns.
+A sharp tongue is the only edge tool that grows keener with constant use. -W. Irving
+A song in time is worth a dime.
+A stitch in time keeps your tu-tu from becoming a four-four.
+A system programmer is someone who debugs his programs with an oscilloscope.
+A theory is better than an explanation.
+A thousand throats may be slit in one night by a running man.
+A tree is best measured when it is down.
+A truly wise man never plays leapfrog with a Unicorn.
+A twisting road will take you to Warsaw, but you won't be bored.
+A u.f.o. closely encountered is no longer a u.f.o.
+A victory is the greatest tragedy in the world - except a defeat. -Wellington
+A watched terminal never prints.
+A wise man never tries to warm himself in front of a painting of a fire.
+A witty saying means nothing. -Voltaire
+A woman is only a woman, but a good cigar is a smoke. -Rudyard Kipling
+A578 Your FLP overflows into your KBUF.
+AAAOO OOZOR AZZAZ ZAIEO AZAEI IIOZA KHOEO OOYTH OEAZA EAOOZ AKHOZ AKHEY THXAA LETHX KH
+Ablata at alba.
+Abortion and suicide are hereditary only if you prevent them.
+About all some parents accomplish in life is either illegal, immoral or fattening.
+About all some parents accomplish in life is to send a child to Harvard.
+About the only thing on a farm that has an easy time is the dog.
+Absence is better than a cure.
+Absinthe makes the heart grow fonder.
+According to the latest official figures, 43% of all statistics are worthless.
+Accounting software is structured as a set of tools that can be used to build accounting systems.
+Ad pulchritudinem tria requiruntur; integritas, consonantia, claritas. -Aquinas
+Admiration is our polite recognition of another's resemblance to ourselves.
+After 24 hours, corpses and guests smell bad.
+After all, a murderer is only an extroverted suicide.
+Afternoon very favorable for romance. Try a single person for a change.
+Again, and strongly, undress the sheep. It is getting to visitors.
+All articles that coruscate with resplendence are not truly aurifers.
+All dare to write, who can or cannot read. -Horace, `Epistles', Book II
+All my men wear badges, or they wear nothing at all.
+All of the bridges between our today and our yesteryear have been burnt.
+All of these futures having been sold, this fortune appears as a matter of record only.
+All syllogisms have three parts; therefore this is not a syllogism.
+All that does not glitter is not not-gold.
+All the ethics in Hollywood can be rolled up and fit into a gnat's navel.
+All the good ones are taken.
+All the great men are dead and I'm not feeling too well myself.
+All things considered, life is 9-to-5 against.
+All things that are, are lights.
+All you need to know is in the manual.
+Almost all good computer programs contain at least one random-number ge