
RomVault 2.0.0 released


Robert


http://www.romvault.com/

 

Hi,

Around August 2011 I was working on the RomVault 1.7 code and getting frustrated with the direction it was heading. The biggest problem was that the core data structure used internally by RV1.7 was modelled on the list of DATs it scanned in from DATRoot. That sounds correct, but I kept running into problems like this: where do I store information about files found in a directory that is not one of the DAT directories? The scanning code was doing more and more fixing just so that it could complete a scan, such as copying unknown directories and files over to ToSort. Also, as I am sure all you RV1.7 users know, the data structure had no clean way to store the state of the tree of DATs displayed in the UI, so the tree kept resetting itself; it was not really a tree structure at all, just a list of DATs displayed as a tree. This was all heading in the wrong direction, fast.

 

So I realized I had to change the core data structure, because what I am really working with is not DATs; I am really processing a structure of files and directories. (The DATs are just a way of representing the list of files you want to collect.) So I started to rework the core data structure into one that would hold a tree of files and directories. At that time, back in August 2011, I figured on a couple of months of work: build the new core structure, put all the old scanning and fixing code back on top of it, and RomVault 2.0 would be ready to go.

 

The new core data structure was made, then I converted over the DAT loader, then got the file scanner going, and everything was going along well. Then I started working on the Finding Fixes code and got into a world of CRCs, SHA1s and MD5s. It was probably around 10 months ago that we (that’s me and my great Beta testers) realized that fixing on CRC and file size alone was not going to cut it. So after a couple more months Levels of scanning and fixing were born, and after a few more months Levels of scanning and fixing were just about working, and then the improved DAT loader was developed, and oh yes, the Missing report code had never been converted over, and then the idea of supporting filenames longer than 260 characters came along, and fast Zip processing was added………..

 

You get the idea: the project just kept on snowballing. So here we are today, around 16 months later, with what is RomVault 2.0.0, after 40-ish private Beta versions and about 300 check-ins to source control. (That works out at about one check-in every 1.5 days for about 16 months!!! You did already click on that Donate button over there on the left, I hope.)

 

Is RomVault 2.0.0 finished? No, nothing is ever finished. There is a great big list of things still to be added to it, but it is a great big step up from 1.7. Before you ask, there are 2 things still at the top of the to-do list: sorry, there is still no 7z support and still no CHD support. But I will put this another way: neither of those 2 features would ever have fitted into the 1.7 data structure, so look out for the 2.1 development cycle starting. (I actually already have a private test version of RomVault that does, to a limited degree, support 7z.)

 

So now you need to know a little about scanning levels and how RV works.

First you need to know what a CRC32 is. Put as simply as possible, it is a 4-byte (8 hex digit) checksum of your file. ZIP computes and stores the CRC32 of each file you put into a ZIP, so that when it extracts the file it can recheck the CRC32 and see that the file has not become corrupt.
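To make that concrete, here is a small Python sketch (not RomVault's own code, just an illustration using Python's standard `zipfile` module) that reads the stored CRC32 and uncompressed size of every entry straight from the ZIP directory, without extracting anything. The helper names and the demo file are made up for the example.

```python
import io
import zipfile
import zlib


def read_zip_header_crcs(zip_source):
    """Return {entry name: (size, crc32 as 8 hex digits)} from the ZIP directory.

    Nothing is decompressed here; the values are the ones ZIP stored
    when each file was added to the archive.
    """
    with zipfile.ZipFile(zip_source) as zf:
        return {
            info.filename: (info.file_size, f"{info.CRC:08x}")
            for info in zf.infolist()
        }


def build_demo_zip(payload=b"hello romvault"):
    """Build a tiny in-memory ZIP so the sketch runs without any files on disk."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("demo.bin", payload)
    buf.seek(0)
    return buf


entries = read_zip_header_crcs(build_demo_zip())
for name, (size, crc) in entries.items():
    print(name, size, crc)
```

The CRC printed for `demo.bin` is the same value `zlib.crc32` gives for the original bytes, which is exactly the recheck ZIP performs on extraction.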

 

And here is an old DAT file game description, showing the CRC values of the files needed for this game:

 

game (
	name 2020bb
	description "2020 Super Baseball"
	romof neogeo
	rom ( name 030-p1.bin size 524288 crc d396c9cb )
	rom ( name 000-lo.lo merge 000-lo.lo size 131072 crc 5a86cff2 )
	rom ( name sp-s2.sp1 merge sp-s2.sp1 size 131072 crc 9036d879 )
	rom ( name sfix.sfix merge sfix.sfix size 131072 crc c2ea0cfd )
	rom ( name 030-s1.bin size 131072 crc 7015b8fc )
	rom ( name 030-m1.bin size 131072 crc 4cf466ec )
	rom ( name 030-v1.bin size 1048576 crc d4ca364e )
	rom ( name 030-v2.bin size 1048576 crc 54994455 )
	rom ( name 030-c1.bin size 1048576 crc 4f5e19bd )
	rom ( name 030-c2.bin size 1048576 crc d6314bf0 )
	rom ( name 030-c3.bin size 1048576 crc 47fddfee )
	rom ( name 030-c4.bin size 1048576 crc 780d1c4e )
)

 

So these CRC32 numbers become a fingerprint of your file. RV1.7 used the CRC32 it found in the zip file header, plus the file size, to identify your files. So if you had a zip file in ToSort, RomVault would look at the header of that zip file and read all the CRCs and sizes from it, and if a DAT required a file matching one of those CRC32 and size pairs, RomVault would move that file from ToSort to the correct zip file as described by the DAT.
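The lookup described above can be sketched as a dictionary keyed on (size, CRC32). This is only an illustration in Python; the `DatRom` type and the function names are hypothetical and are not taken from RomVault's source. The two ROM entries are copied from the 2020bb DAT entry above.

```python
from typing import NamedTuple


class DatRom(NamedTuple):
    """One rom line from a DAT (hypothetical type for this sketch)."""
    game: str
    name: str
    size: int
    crc: str


def build_index(dat_roms):
    """Index DAT roms by (size, lowercase crc) so lookups are a single dict hit."""
    index = {}
    for rom in dat_roms:
        index.setdefault((rom.size, rom.crc.lower()), []).append(rom)
    return index


def find_targets(index, size, crc):
    """Return every DAT rom that a found file with this size and CRC could satisfy."""
    return index.get((size, crc.lower()), [])


DAT_ROMS = [
    DatRom("2020bb", "030-p1.bin", 524288, "d396c9cb"),
    DatRom("2020bb", "030-s1.bin", 131072, "7015b8fc"),
]
INDEX = build_index(DAT_ROMS)

# A file found in ToSort whose zip header reports this size and CRC:
for rom in find_targets(INDEX, 524288, "D396C9CB"):
    print(f"belongs in {rom.game}.zip as {rom.name}")
```

The index stores a list per key rather than a single rom, because nothing stops two different DAT entries from sharing a size and CRC.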

 

This is all super-fast and easy, as the CRC32 was right there waiting for me to look at in the zip file information.

 

But here the problem started: the more files you compare this way, the more you realize that CRC32 is really not enough. Which brings us right up to date with the very latest TOSEC DAT sets:

 

Commodore C64 - Games - [D64] (TOSEC-v2012-12-22_CM).dat

 

game (
	name "Elvira - The Arcade Game (1991)(Flair Software)(M3)[cr MHI][t +2 MHI][a]"
	description "Elvira - The Arcade Game (1991)(Flair Software)(M3)[cr MHI][t +2 MHI][a]"
	rom (
		name "Elvira - The Arcade Game (1991)(Flair Software)(M3)[cr MHI][t +2 MHI][a].d64"
		size 174848
		crc 298caa9c
		md5 71700ed3cf36b0724e0dd8e68fe32ed9
		sha1 0dde6ccb17ed13c05f7f3a03ee6d636f5ba56545 )
)

 

 

game (
	name "Jimbo (1995)(CP Verlag)[cr FLT][a]"
	description "Jimbo (1995)(CP Verlag)[cr FLT][a]"
	rom (
		name "Jimbo (1995)(CP Verlag)[cr FLT][a].d64"
		size 174848
		crc 298caa9c
		md5 92e4a98d8d9c2234c066124765a5192e
		sha1 ba3eb11347d159951d94d7a34a3d08c40b6d009c )
)

 

Two files from the Commodore C64 set, and as you can see they both have the same size and a CRC of 298CAA9C!!!

Our system fails; CRC is not enough. So bring on SHA1 & MD5, two other hashing methods with, as you can see, much longer hash values, and a vanishingly small chance now of the SHA1 checksum failing to be the correct file fingerprint.
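To put a rough number on "not enough", here is a back-of-the-envelope Python sketch using the standard birthday-problem approximation. It assumes an idealized hash whose values are spread uniformly at random, which is only an approximation for real files, and it ignores file size (fair for D64 images like the two above, which are all the same size, so size adds no extra discrimination). The function name and the file count of 100,000 are made up for illustration.

```python
import math


def collision_probability(n_files, hash_bits):
    """Approximate chance that at least two of n_files share a hash value.

    Birthday approximation: p = 1 - exp(-n(n-1) / (2 * 2**bits)).
    math.expm1 keeps the result accurate when p is extremely small.
    """
    space = 2.0 ** hash_bits
    return -math.expm1(-n_files * (n_files - 1) / (2.0 * space))


n = 100_000  # illustrative collection size, not a figure from any real DAT set
print(f"CRC32 (32 bits):  {collision_probability(n, 32):.3f}")
print(f"MD5   (128 bits): {collision_probability(n, 128):.3e}")
print(f"SHA1  (160 bits): {collision_probability(n, 160):.3e}")
```

Under those assumptions the 32-bit figure comes out at roughly 0.69, so in a collection of that size a clash somewhere is more likely than not, while the SHA1 figure is so small it is effectively zero for accidental clashes.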

 

So now back to RomVault 2.0

Level 1 scanning: still just reads the CRC/size right out of the headers of the ZIP files. This is super fast, and if you just want to fix a MAME set as quickly as you can and are about to join the next MAME torrent, which will fix anything that may be wrong anyway, then this is still probably enough for you.

 

Level 3 scanning: this gets the full set of CRC, size, SHA1 & MD5 for every file it scans, which is unfortunately much, much, much slower than the quick Level 1 scan, as it has to fully extract every file out of every zip file and compute all of the checksums on every file. (RomVault 2.0.0 does all of this in memory, so it tries to be as fast as it can.) But it does of course guarantee that all your files are truly correct.
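A full scan of this kind can be sketched in Python as below. This is not RomVault's implementation, only an illustration of the idea using the standard `zipfile`, `zlib` and `hashlib` modules: each entry is decompressed in memory in chunks and fed to all three hashes at once, so nothing is written to disk. The function name and chunk size are assumptions for the example.

```python
import hashlib
import io
import zipfile
import zlib


def full_scan_zip(zip_source, chunk_size=1 << 20):
    """Decompress every entry in memory and compute size, CRC32, MD5 and SHA1."""
    results = {}
    with zipfile.ZipFile(zip_source) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            crc = 0
            md5 = hashlib.md5()
            sha1 = hashlib.sha1()
            size = 0
            with zf.open(info) as member:
                while True:
                    chunk = member.read(chunk_size)
                    if not chunk:
                        break
                    # One pass over the data updates all three checksums.
                    crc = zlib.crc32(chunk, crc)
                    md5.update(chunk)
                    sha1.update(chunk)
                    size += len(chunk)
            results[info.filename] = {
                "size": size,
                "crc": f"{crc:08x}",
                "md5": md5.hexdigest(),
                "sha1": sha1.hexdigest(),
            }
    return results


# Demo on a small in-memory archive.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("demo.bin", b"some rom data")
demo.seek(0)
print(full_scan_zip(demo))
```

The cost is plain to see: unlike the header-only read, every byte of every entry has to be decompressed and hashed.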

 

And then the better option:

Level 2 scanning: (USE THIS ONE) Level 2 scans every file just the same as Level 3, getting and checking the full set of CRC, SHA1 & MD5, so the first time you run a Level 2 scan it will take a very long time. But it also stores the file time stamp of each of your zip files, so the second time you run a scan it checks the time stamp, and if the zip file has not changed it does not recheck it. The second time around it actually goes super fast again, only slowing down to read any new or changed files.
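The time stamp trick can be sketched as a small cache in Python. This is an illustration only, not RomVault's code: the cache layout, the function names and the use of the file modification time in nanoseconds are all assumptions made for the example, and `full_scan` stands in for whatever does the slow hashing.

```python
import os
import tempfile


def scan_with_cache(paths, cache, full_scan):
    """Run full_scan only on files whose time stamp differs from the cached one.

    cache maps path -> {"stamp": mtime in ns, "hashes": result of full_scan}.
    Returns the list of paths that actually had to be rescanned.
    """
    rescanned = []
    for path in paths:
        stamp = os.stat(path).st_mtime_ns
        entry = cache.get(path)
        if entry is None or entry["stamp"] != stamp:
            cache[path] = {"stamp": stamp, "hashes": full_scan(path)}
            rescanned.append(path)
    return rescanned


def demo():
    """Scan one file three times: new, unchanged, then modified."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "set.zip")
        with open(path, "wb") as fh:
            fh.write(b"first version")
        cache = {}

        def slow_scan(p):
            # Stand-in for the expensive full hash of the file.
            return {"size": os.path.getsize(p)}

        first = scan_with_cache([path], cache, slow_scan)
        second = scan_with_cache([path], cache, slow_scan)
        # Change the file and push its time stamp forward by one second.
        with open(path, "wb") as fh:
            fh.write(b"second version!")
        st = os.stat(path)
        os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))
        third = scan_with_cache([path], cache, slow_scan)
        return len(first), len(second), len(third)


print(demo())
```

The trade-off is that the cache trusts the time stamp: a file altered without its time stamp changing would be skipped, which is the case a Level 3 scan is there to catch.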

 

There is also Level 1, 2 & 3 fixing, which works in much the same way as scanning.

Level 1 fixing only compares size and CRC.

Level 2 fixing compares size, CRC, SHA1 & MD5, but will still copy the raw compressed data from one zip to another without recompressing it, for speed, if it can.

Level 3 fixing compares size, CRC, SHA1 & MD5, but will uncompress and recompress every file every time.
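The difference in what each fixing level compares can be sketched in Python using the two C64 entries from the DAT above. This is an illustration, not RomVault's code. The function name is hypothetical, and one behaviour is purely my assumption for the sketch: a hash that is missing on either side (for example an old DAT with only CRCs) is skipped rather than treated as a mismatch.

```python
ELVIRA = {
    "size": 174848,
    "crc": "298caa9c",
    "md5": "71700ed3cf36b0724e0dd8e68fe32ed9",
    "sha1": "0dde6ccb17ed13c05f7f3a03ee6d636f5ba56545",
}
JIMBO = {
    "size": 174848,
    "crc": "298caa9c",
    "md5": "92e4a98d8d9c2234c066124765a5192e",
    "sha1": "ba3eb11347d159951d94d7a34a3d08c40b6d009c",
}


def is_match(found, wanted, level):
    """Decide whether a found file satisfies a DAT entry at a given fixing level.

    Level 1 compares size and CRC only; levels 2 and 3 also compare MD5 and SHA1.
    Assumption for this sketch: a field missing on either side is skipped.
    """
    fields = ("size", "crc") if level == 1 else ("size", "crc", "md5", "sha1")
    for field in fields:
        a, b = found.get(field), wanted.get(field)
        if a is None or b is None:
            continue
        if a != b:
            return False
    return True


# Pretend the Jimbo image was found in ToSort and the DAT wants Elvira.
print("Level 1:", is_match(JIMBO, ELVIRA, level=1))
print("Level 2:", is_match(JIMBO, ELVIRA, level=2))
```

At Level 1 the Jimbo image is wrongly accepted as Elvira because size and CRC agree; at Level 2 the MD5 and SHA1 tell them apart.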

 

So with all that said, use LEVEL 2 Scanning, and LEVEL 2 Fixing, and your future should be good.

 
