# LRValidate - Validate image data from Lightroom Catalog



## Linwood Ferguson (Mar 16, 2016)

Over in another thread (here) we got into a discussion about a program I built some time ago.  I feel we are hijacking that thread somewhat (even though it is on point somewhat also), so I would like to move the discussion here, if indeed there is followup discussion.

First, a bit of background.  I feel that Lightroom does a fair job of keeping track of its own catalog's integrity, and a mediocre job of tracking whether the images the catalog points to retain their integrity.  It can check if they still exist, but not if something (failing disk, run amok software, etc.) has changed them, and not directly check if they are still readable/usable.

Adobe added some checks to improve this with DNGs, where the DNG itself contains a checksum of the image data (but not of associated data, such as metadata you write back into it, at least as far as I know).

Coming from an I.T. background (i.e. having learned you can never trust computers), I wanted something that did a more thorough check, and the result was an open source project here: 

LRValidate - Validate Images for Adobe Lightroom(R) - Home

This is a standalone program which sets up a validation environment in your catalog (basically a list of all the files it has checked, and their checksum) and then will periodically validate that they are unchanged from the last time you checked.   It validates the entire file, so any change in metadata or image data - any bits flipped by failing disks - any change will be detected. 
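To make the idea concrete, here is a minimal Python sketch of whole-file validation. It is not LRValidate's code: the hash algorithm (SHA-256), the function names, and the in-memory dictionary standing in for the list kept in the catalog are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def file_checksum(path, chunk_size=1 << 20):
    """Hash every byte of the file, reading in chunks to keep memory use flat."""
    digest = hashlib.sha256()  # assumption: the post does not name the algorithm
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_baseline(paths):
    """First pass: remember a checksum for each file."""
    return {path: file_checksum(path) for path in paths}


def revalidate(baseline):
    """Later pass: report files that are missing or whose bytes changed."""
    problems = {}
    for path, expected in baseline.items():
        if not os.path.exists(path):
            problems[path] = "missing"
        elif file_checksum(path) != expected:
            problems[path] = "changed"
    return problems


# Demo on a throwaway file (hypothetical name, not a real raw file).
with tempfile.TemporaryDirectory() as folder:
    photo = os.path.join(folder, "photo.raw")
    with open(photo, "wb") as handle:
        handle.write(b"\x00" * 4096)
    baseline = record_baseline([photo])
    clean_result = revalidate(baseline)

    # Change a single byte in place, as a failing disk might.
    with open(photo, "r+b") as handle:
        handle.seek(100)
        handle.write(b"\x01")
    flipped_result = revalidate(baseline)
```

Because the hash covers the whole file rather than just the image data, a change to embedded metadata shows up as "changed" exactly like a flipped bit would.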

The idea is that in Lightroom, MOSTLY, we do not edit original files.  When we do, it is usually something we are aware of (e.g. edit-original in something like Photoshop) and typically do on recent photos.  So if you see a 3-year-old TIFF changed, you likely either remember doing it, or have a problem.

The program is a standalone executable, Windows only.  That is mostly for performance -- it is heavily multi-threaded, and while it takes a long time to run, it is very efficient -- it has to read every byte of every file, and will generally saturate your disk(s), and is best run overnight if it is doing a comprehensive run.

It is free for use, and source code is provided for criticism or correction or just so you can inspect and make sure I am not doing something bad.  You can build from source and run that, or use the included 32-bit and 64-bit install kits.

I'll put a couple replies in with issues raised in the other thread.


----------



## Roelof Moorlag (Mar 16, 2016)

Very good Ferguson!
We met in the other thread you mentioned, and I'm enthusiastic about your program. I like being able to check all files managed by Lightroom, not only DNGs.
I did some small tests, but I'm going to test on my main catalog soon and will let you know my results.
Two suggestions for the application at this moment: 

1. In the 'review Re-validation errors' screen it would be handy to go to the 'more information' screen by a double-click.
2. In the 'review Re-validation errors' screen it's possible to right-click an image and choose 'accept checksum difference'. It would be nice to be able to skip the confirmation then.


----------



## Linwood Ferguson (Mar 16, 2016)

Some issues that came up in the other thread:

*1) What happens on minor and major upgrades:*

While no one knows for sure what Adobe will do, in the past, minor upgrades left the catalog in place, and major upgrades copied it to a new version.  The data which is used for validation resides in the catalog, but NOT in any of Adobe's tables, so the catalog changes the program makes should not conflict with anything Adobe does.  Now it is possible Adobe can change something that breaks this program, but as the minimal information it uses (basically image existence and location) is pretty fundamental, it seems unlikely.  It has survived a half dozen or so minor upgrades without issue. 

For major upgrades, Adobe copies the database that is the catalog, and as best I can tell does so table by table.  This means the validation information is lost during a major upgrade.  There is one easy solution, and one not so easy.  

The easy solution is to validate your pre-upgrade database before upgrading, then do a "Find New" on the new catalog and let it start over; it will rebuild the validation information.  The odds that an image file gets corrupted in between, and is thus missed, are slight but non-zero.

You can also use any of the various SQLite utilities to copy the two tables (LightroomValidateErrors, LightroomValidateImages) to the new catalog, and it will pick up where it left off.
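As a rough illustration of the table copy, here is a sketch using Python's built-in `sqlite3` module. Only the two table names come from the post; the function name, the demo file names, and the column layout in the demo are hypothetical. Try something like this only on backup copies of both catalogs, with Lightroom closed.

```python
import os
import sqlite3
import tempfile

VALIDATION_TABLES = ("LightroomValidateErrors", "LightroomValidateImages")


def copy_validation_tables(old_catalog, new_catalog, tables=VALIDATION_TABLES):
    """Copy the named tables (definition and rows) from old_catalog into new_catalog."""
    conn = sqlite3.connect(new_catalog)
    try:
        conn.execute("ATTACH DATABASE ? AS old", (old_catalog,))
        for table in tables:
            rows = conn.execute(
                "SELECT sql FROM old.sqlite_master WHERE type = 'table' AND name = ?",
                (table,),
            ).fetchall()
            if not rows:
                continue  # table not present in the old catalog
            conn.execute(rows[0][0])  # recreate the table from its stored definition
            conn.execute(f'INSERT INTO main."{table}" SELECT * FROM old."{table}"')
        conn.commit()
        conn.execute("DETACH DATABASE old")
    finally:
        conn.close()


# Demo with stand-in databases; the columns here are invented for the example.
with tempfile.TemporaryDirectory() as folder:
    old_path = os.path.join(folder, "old.lrcat")
    new_path = os.path.join(folder, "new.lrcat")
    old = sqlite3.connect(old_path)
    old.execute("CREATE TABLE LightroomValidateImages (path TEXT PRIMARY KEY, checksum TEXT)")
    old.execute("CREATE TABLE LightroomValidateErrors (path TEXT, message TEXT)")
    old.execute("INSERT INTO LightroomValidateImages VALUES ('a.dng', 'abc123')")
    old.commit()
    old.close()
    sqlite3.connect(new_path).close()  # empty stand-in for the upgraded catalog

    copy_validation_tables(old_path, new_path)

    check = sqlite3.connect(new_path)
    copied_rows = check.execute(
        "SELECT path, checksum FROM LightroomValidateImages"
    ).fetchall()
    copied_errors = check.execute("SELECT * FROM LightroomValidateErrors").fetchall()
    check.close()
```

This copies table definitions and rows only. Any indexes or triggers defined on the tables would need to be copied separately, and the sketch will fail if the tables already exist in the new catalog.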

*2) What happens for other media types*

The program pretty much ignores media type, and will checksum whatever data is pointed to by the catalog - video, image, etc. When it is displaying information (trying to help you resolve changes) it may try to display the image, and uses the standard Windows image display functions -- if the thing being displayed will not display, it comes up blank (or conceivably fails and aborts). 

*3) How long does it take to initialize*

It depends mostly on the speed of your disks, and possibly on the speed of your CPU.  It creates many threads and runs them independently, and will generally saturate your disk or CPU doing the image reads and checksums.  I have done about 50,000 images: I left it running overnight, and it had finished by the next morning.  I do not know how long it actually took, but I believe most of the night.
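For a feel of what "many threads doing reads and checksums" looks like, here is a small Python sketch using a thread pool. It is an illustration only, not the program's implementation: the worker count, chunk size, SHA-256, and all names are assumptions.

```python
import hashlib
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def checksum_one(path, chunk_size=1 << 20):
    """Read the whole file in chunks and return (path, hex digest)."""
    digest = hashlib.sha256()  # assumption: the post does not name the algorithm
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return path, digest.hexdigest()


def checksum_many(paths, workers=8):
    """Hash many files concurrently; returns {path: hex digest}."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(checksum_one, paths))


# Demo on a handful of throwaway files.
with tempfile.TemporaryDirectory() as folder:
    paths = []
    for index in range(5):
        path = os.path.join(folder, f"image_{index}.raw")
        with open(path, "wb") as handle:
            handle.write(os.urandom(64 * 1024))
        paths.append(path)

    sequential = dict(checksum_one(p) for p in paths)
    parallel = checksum_many(paths, workers=4)
```

File reads and `hashlib` updates on large buffers release Python's global interpreter lock, so the threads can overlap disk and CPU work. Whether more workers help depends on the storage: a single spinning disk may be saturated by just a few readers.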

You can stop it and restart it as you like -- there is a cancel button that appears when it is running which will stop it (it takes a few minutes to wind down the threads).   

*4) Is it safe*

I believe it to be, and welcome review of the code by any of the programmers around.  However, like anything else that touches your catalog, there is always some risk involved.  You can run it on a separate copy of your catalog if you prefer, though of course it cannot then incorporate new changes.  Backing up your catalog is always a good idea, not just for this.

It does not write to other files, such as your images, at all; it only reads from them.


----------



## Linwood Ferguson (Mar 16, 2016)

Roelof, with regard to your issue on a 4.1 DNG crashing it.

From here

I was able to reproduce that as well, and also to see that it crashes Windows Explorer on Windows 10 x64.

The exception code would seem to indicate a heap corruption, and my guess is that it's buried somewhere in the Windows codec that handles that file type, which ultimately may originate with Adobe.

The validation program, to display an image, reads it as a buffered stream into a memory stream, then turns it into a BitmapImage.  All the hard work of deciding what it actually is occurs inside the Windows components.   The code that does this has a standard exception handler, but it cannot catch this sort of exception, as it occurs at a more privileged code level, so there is nothing straightforward I can do to fix it (I could check explicitly and not try to display DNGs, but this seems specific to one version of DNG). 

I suspect it is more productive to try to find out if Adobe or Microsoft is planning to fix it.

Or upgrade to a newer DNG format perhaps.


----------



## Linwood Ferguson (Mar 16, 2016)

Roelof Moorlag said:


> Two suggestions for the application at this moment:
> 
> 1. In the 'review Re-validation errors' screen it would be handy to go to the 'more information' screen by a double-click.
> 2. In the 'review Re-validation errors' screen it's possible to right-click an image and choose 'accept checksum difference'. It would be nice to be able to skip the confirmation then.



I think both of those are easy.  That screen does allow multi-select and a double click would need to be specific only to a single item I think.

The warnings... yes, maybe.   I tried to be very conservative so that people using this without reading any documentation (first, there is not a lot, and secondly no one reads it anyway) would understand what they are doing and any implications.

Maybe show it once and suppress?   Or a "Do not show again"? 

Let's see if anyone else actually likes it.  Honestly this was more of a dead end for me.  I use it, and a couple of years ago I posted about it in a couple of places, but never got any real interest from people.

Which may also explain why its features are not built into Lightroom -- I think people just do not demand/expect much in the way of data validation and redundancy.  There's a huge amount of trust for computers and software, and not much market for belt-and-suspenders type software.


----------



## Roelof Moorlag (Mar 16, 2016)

Ferguson said:


> Maybe show it once and suppress? Or a "Do not show again"?


Both are very good options. In the beginning warnings are useful. Later on it's nice to be able to hide them.



Ferguson said:


> Let's see if anyone else actually likes it. Honestly this was more of a dead end for me. I use it, and a couple of years ago I posted about it in a couple of places, but never got any real interest from people.
> 
> Which may also explain why its features are not built into Lightroom -- I think people just do not demand/expect much in the way of data validation and redundancy. There's a huge amount of trust for computers and software, and not much market for belt-and-suspenders type software.


Very strange indeed, this lack of interest. Since I read The DAM Book by Peter Krogh I have been searching for tooling that does data validation, and I was very excited when Adobe introduced DNG validation in Lightroom. I was not aware of your initiative. Maybe more people simply do not know about it, and that's why it's good to draw attention to it again here.


----------



## Linwood Ferguson (Mar 16, 2016)

Roelof Moorlag said:


> Very strange indeed, this lack of interest. Since I read The DAM Book by Peter Krogh I have been searching for tooling that does data validation, and I was very excited when Adobe introduced DNG validation in Lightroom. I was not aware of your initiative. Maybe more people simply do not know about it, and that's why it's good to draw attention to it again here.



I had a theory for business that may apply.

Long ago, the average tenure of a System Administrator (the guy responsible for servers for a business, whatever their title) was many years.  At the same time, the average time between failure of disks was short.  A System Administrator could be almost assured of being responsible for restoring their systems from backup frequently while he or she was there.

As time went on, job hopping became more prevalent, and the average tenure went down.

Meanwhile, computer reliability went up.  And up a lot.  RAID meant that failures that lose data were even less likely.  The average time between needing to actually restore from backup went way up.

Eventually they crossed, and it was likely that during one's tenure at a job, you were never actually responsible for restoring a system from backup, and even if you were, it was very likely one that someone else had planned the backups for (i.e. you had someone to blame). 

The result was that, over time, system administrators became FAR, FAR less paranoid in their backup planning.  Only companies with good IT auditors, or CIOs who were technically competent and checked, really had good, solid backup plans, because your average system admin just didn't (couldn't, wouldn't, whichever).

I think a lot of similar progress has happened in personal computer hardware.  Rather than a disk failing regularly for home users, they typically replace the computer before they have a serious data loss.   Depending on your viewpoint, they either get complacent, or they don't worry about non-problems (I land on "complacent"). 

But I spent way too long on the receiving end of Murphy's Law.  I want belts and suspenders, and even then I still make sure I have clean underwear on just in case.


----------



## Roelof Moorlag (Mar 16, 2016)

Ferguson said:


> Roelof, with regard to your issue on a 4.1 DNG crashing it.
> 
> From here
> 
> I was able to reproduce that as well, and also to see that it crashes Windows Explorer on Windows 10 x64.


How unexpected that you could reproduce it! I tried it on another Windows 10 x64 system and I could not, so I assumed it was a local problem on my system. I also could not find any information about it. 



Ferguson said:


> I suspect it is more productive to try to find out if Adobe or Microsoft is planning to fix it.


Yes, I think so too. When Windows Explorer handles the file well, your program will too, I think.
It's only that I do not know how to raise this issue with Adobe and/or Microsoft.



Ferguson said:


> Or upgrade to a newer DNG format perhaps.


There is not a newer DNG format yet, so we have to wait.


----------



## Linwood Ferguson (Mar 16, 2016)

All I had to do was switch to Large Icon view in Explorer and it dies and vanishes.


Roelof Moorlag said:


> There is not a newer DNG format yet, so we have to wait.


Sorry, I thought you said it was the 4.1 version (newest is 7.1).    I tried re-exporting it under a different version, and the export completed without error, but the result still fails.

Is it just the Fuji X100T?   

But yes, for me it takes down Windows Explorer immediately on it trying to render.  Interestingly IrfanView works fine, and Photo Mechanic works fine (both in thumbnail and full res view).  Something in Windows.  Maybe you'll get more response over in that thread you started.

But LRValidate will tell you if it changes.


----------



## Roelof Moorlag (Mar 16, 2016)

Ferguson said:


> Sorry, I thought you said it was the 4.1 version (newest is 7.1)


Yes, the DNGs I made from the Fuji X100T with the compatibility setting Lightroom 4.1 (= Camera Raw 7.1, hence the confusion) are causing the problem. When I choose a 'lower' compatibility setting nothing is wrong.

For testing purposes I tried to make a DNG from my NEF files with the same compatibility setting (LR 4.1), but that did not work out. I tried it with Lightroom and with DNG Converter, but in all cases the result is a DNG with compatibility Lightroom 1.0...
So I do not know yet whether the problem is the RAF source or the DNG result.


----------



## PhilBurton (Mar 24, 2016)

Ferguson said:


> Over in another thread (here) we got into a discussion about a program I built some time ago.  I feel we are hijacking that thread somewhat (even though it is on point somewhat also), so I would like to move the discussion here, if indeed there is followup discussion.
> 
> First, a bit of background.  I feel that Lightroom does a fair job of keeping track of its own catalog's integrity, and a mediocre job of tracking whether the images the catalog points to retain their integrity.  It can check if they still exist, but not if something (failing disk, run amok software, etc.) has changed them, and not directly check if they are still readable/usable.
> 
> ...


Ferguson,

Not sure if you realized the full implication of your work here.  You have just removed one of the main reasons for converting RAW files to DNG.  Good work, man.

Phil


----------



## Linwood Ferguson (Mar 24, 2016)

PhilBurton said:


> Not sure if you realized the full implication of your work here.  You have just removed one of the main reasons for converting RAW files to DNG.  Good work, man.



Thanks.  I hope that doesn't put me on some Adobe hit list.


----------

