A simple change to the policy for when Windows Lightroom Classic writes out its preferences file could avoid most “corruptions” of preferences, which cause so much of the flaky behavior users experience.
The current policy is: The file gets written 30 seconds after the most recent assignment to a preferences key or 60 seconds after the first assignment to a key after the file was last written, whichever comes first. When LR exits, the preference file is written after all user-interface tasks have terminated.
The new policy should drop the 60-second rule: The file gets written 30 seconds after the most recent assignment to a preferences key. When LR exits, the preference file is written after all user-interface tasks have exited.
In addition, Adobe should verify that when LR shuts down, every task that stores preference values terminates properly before the preferences are written to disk.
Though these changes are simple, justifying them takes a lot of explanation.
Note that I have no inside information about LR’s internals, and everything written here is based on observing LR’s behavior and my dozen years’ experience developing LR plugins, pushing the limits of the plugin SDK. I may have made some incorrect inferences, and if so, I’d love to be corrected.
The Preferences Design Flaw: A Race Condition
LR stores a large amount of internal application state in the preferences file, not just user settings, and if that state gets recorded inconsistently, then chaos in the user interface results. Unfortunately, the preferences implementation has a design flaw that makes it too easy for LR to store inconsistent values.
Internally, LR’s preferences are implemented as a special Lua table of key-value pairs. LR updates a key’s value with a simple assignment statement:
prefs.key = value
A background task periodically writes the table to the preferences file. But there is no mechanism to ensure the background task gets a consistent snapshot of the table. Consider an LR module that updates two preference keys:
prefs.key1 = e1
prefs.key2 = e2
If the background task grabs a snapshot of the preferences table in between the two assignments, the preferences file will contain the new value for key1 and the old value for key2. If LR is then restarted and the module reads the inconsistent values for key1 and key2, it could get very confused.
The background task could run in between the two assignments if the expression e2 directly or indirectly makes an API call that yields to other tasks (e.g. to do file i/o). In practice, with complicated programs like LR composed of many layers of abstraction, it can be very difficult for a programmer to determine whether an expression could yield.
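The interleaving can be reproduced outside LR. Here is a minimal sketch in plain Lua, assuming that coroutines are a fair stand-in for LR's non-preemptive tasks; yieldingCall, writePrefs, and the file table are hypothetical names for illustration, not LR internals:

```lua
-- Simulation only: plain Lua coroutines stand in for LR's cooperative tasks.
local prefs = { key1 = 0, key2 = 0 }   -- in-memory preferences table
local file = nil                        -- stands in for the preferences file

-- Hypothetical stand-in for an API call that yields (e.g. file i/o).
local function yieldingCall(value)
    coroutine.yield()
    return value
end

-- Module code that updates two keys that must stay consistent.
local updater = coroutine.create(function()
    prefs.key1 = 1
    prefs.key2 = yieldingCall(1)   -- yields before key2 is assigned
end)

-- Background task: copies the table exactly as it is right now.
local function writePrefs()
    file = { key1 = prefs.key1, key2 = prefs.key2 }
end

coroutine.resume(updater)   -- runs until the yield inside e2
writePrefs()                -- background task runs between the assignments
coroutine.resume(updater)   -- updater finishes; the table is consistent again

print(file.key1, file.key2)     -- the "file" holds key1 = 1, key2 = 0
print(prefs.key1, prefs.key2)   -- memory holds key1 = 1, key2 = 1
```

The in-memory table ends up consistent, but the snapshot taken during the yield does not, and that snapshot is what a restart would read.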
Other applications use preemptive multi-tasking with robust concurrency mechanisms to avoid such race conditions. But the LR user interface was implemented with non-preemptive tasks and mostly without explicit synchronization primitives, and, as a practical matter, retrofitting hundreds of thousands of lines of legacy code with synchronization isn’t going to happen.
The Current Preferences File Writing Policy
Currently, the background preferences task writes the preferences file:
- 30 seconds after the most recent assignment to a preferences key, or
- 60 seconds after the first assignment to a preferences key after the file was last written,
whichever comes first. (This is easy to observe using Windows File Explorer.) Each assignment to a key resets the 30-second timer.
The first rule delays writing the preferences file while the application is making rapid changes to preference values. The second rule ensures that any changed preference value will get written to the file within 60 seconds.
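The two rules can be written as a single calculation. This is only a sketch of the policy as I've inferred it from observation; nextWriteTime, firstDirty, and lastDirty are hypothetical names, and times are in seconds:

```lua
-- Sketch of the current policy as inferred in this post; names are hypothetical.
local QUIET = 30   -- seconds after the most recent assignment
local MAX   = 60   -- seconds after the first assignment since the last write

-- firstDirty: time of the first assignment since the file was last written
-- lastDirty:  time of the most recent assignment
-- Returns the time at which the file is due to be written, or nil if
-- nothing has been assigned since the last write.
local function nextWriteTime(firstDirty, lastDirty)
    if not firstDirty then return nil end
    return math.min(lastDirty + QUIET, firstDirty + MAX)
end

print(nextWriteTime(0, 0))    -- 30: a single assignment, written 30 s later
print(nextWriteTime(0, 20))   -- 50: a later assignment resets the 30 s timer
print(nextWriteTime(0, 45))   -- 60: the 60-second rule caps the delay
```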
It’s the 60-second rule that in practice causes the race condition. Consider the two assignments, where the expression e2 invokes a yielding API call, allowing the background preferences task to run:
prefs.key1 = e1
prefs.key2 = e2
The preceding assignment to key1 almost certainly occurs much less than 30 seconds before the background task runs, so the first rule won’t be triggered. (The first rule would be triggered only if the background task resumed more than 30 seconds after the assignment to key1, either because expression e2 takes that long to evaluate or because LR is very loaded down, e.g. while doing an export on a machine with too little memory and too much virtual-memory paging.)
But if it’s been at least 60 seconds since the first assignment to preferences after the file was last written, the second rule will trigger, and the background task will write the new value of key1 and the old value of key2 to the file.
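To make the timing concrete, here is a small simulation with a fake clock. It is a sketch under stated assumptions: the policy is the one inferred above, the background task is called explicitly at the point where e2 would yield, and every name (clock, assign, backgroundTask, the "other" key) is hypothetical:

```lua
-- Simulation only: a fake clock, and an explicit call to the background
-- task at the point where e2 would yield. All names are hypothetical.
local clock = 0
local prefs = { other = 0, key1 = 0, key2 = 0 }
local file                      -- stands in for the preferences file
local firstDirty, lastDirty     -- assignment times since the last write

local function assign(key, value)
    prefs[key] = value
    firstDirty = firstDirty or clock
    lastDirty = clock
end

-- The current policy as inferred above: write if either rule has expired.
local function backgroundTask()
    if firstDirty and (clock - lastDirty >= 30 or clock - firstDirty >= 60) then
        file = { key1 = prefs.key1, key2 = prefs.key2 }
        firstDirty, lastDirty = nil, nil
    end
end

-- The user keeps LR busy: some preference changes every 20 seconds,
-- so the 30-second rule never gets a chance to fire.
for t = 0, 40, 20 do
    clock = t
    backgroundTask()
    assign("other", t)
end

-- Just before t = 60, a module starts its two-key update.
clock = 59.995
assign("key1", 1)
clock = clock + 0.01   -- e2 yields briefly...
backgroundTask()       -- ...and the background task gets to run
assign("key2", 1)

print(file.key1, file.key2)   -- the "file" holds key1 = 1, key2 = 0
```

Only 0.01 seconds have passed since the assignment to key1, so the 30-second rule is nowhere near firing. The write happens purely because 60 seconds have elapsed since the first assignment.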
At this point, there are two ways things could go wrong.
First, if LR exits abnormally, due to a software or hardware crash or the user force-quitting LR, the preferences file remains with the inconsistent values, which will be read when LR next starts.
Second, if LR exits normally, it writes the preferences file one last time before exiting. I believe that internally, LR has a shutdown mechanism to properly terminate tasks before writing the preferences file, ensuring that consistent preference values are written. But if there are some tasks that don’t get shut down properly, they could be modifying preferences as the file is written for the last time, resulting in inconsistent values.
A New Preferences File Writing Policy
The proposed file-writing policy is simple: The file gets written 30 seconds after the most recent assignment to a preferences key. When LR exits, the preference file is written after all user-interface tasks have terminated.
The 60-second rule, writing the file 60 seconds after the first assignment to a preferences key after the file was last written, is dropped.
This ensures that in nearly all cases, the preference file won’t be written between key assignments that require consistency. (The only exception is when the background task runs more than 30 seconds after the assignment to key1, for example because the expression assigned to key2 yields for more than 30 seconds, which should happen very rarely.) Thus, the preferences file is much, much more likely to contain consistent values.
What are the consequences of dropping the 60-second rule? When LR exits normally, there are no consequences, since LR writes out a consistent snapshot of the preferences right before exiting.
But if LR exits abnormally, consider the alternative states of the preferences file:
- With the 60-second rule, the preferences file will contain key values that are no older than 60 seconds, capturing a very recent state of the application. But the file could contain inconsistent values, causing the incorrect behavior reported on the community forums and the countless recommendations to “reset preferences”.
- Without the 60-second rule, the preferences file could contain key values that are much older, going back to when the user last paused for at least 30 seconds, which might be tens of minutes ago. But the captured values are consistent, and LR will behave correctly on restart.
The second alternative is clearly preferable. After an abnormal exit, users may be mildly disappointed that some changes they made to the user-interface state were lost, but they’ll instinctively know how to recover from that easily. But with the first alternative, with LR now misbehaving in strange new ways or even crashing, the user has little idea what’s going wrong or how to recover.
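The trade-off can be seen by computing when the file would be written under each policy for the same stream of assignments. This is a rough sketch, assuming the policies behave as described above; writeTimes and its parameters are hypothetical names, and a write that falls due at the same instant as an assignment is assumed to happen first:

```lua
-- Returns the times at which the file would be written, given a sorted
-- list of assignment times (in seconds). Hypothetical helper for illustration.
local function writeTimes(assignments, useSixtySecondRule)
    local writes = {}
    local firstDirty, lastDirty

    local function due()
        local t = lastDirty + 30
        if useSixtySecondRule then t = math.min(t, firstDirty + 60) end
        return t
    end

    for _, t in ipairs(assignments) do
        if firstDirty and due() <= t then   -- a write fell due before this assignment
            writes[#writes + 1] = due()
            firstDirty, lastDirty = nil, nil
        end
        firstDirty = firstDirty or t
        lastDirty = t
    end
    if firstDirty then writes[#writes + 1] = due() end
    return writes
end

-- Five minutes of steady activity: one assignment every 10 seconds.
local assignments = {}
for t = 0, 300, 10 do assignments[#assignments + 1] = t end

local withRule = writeTimes(assignments, true)
local withoutRule = writeTimes(assignments, false)
print(#withRule, withRule[1])         -- 6 writes, the first at t = 60
print(#withoutRule, withoutRule[1])   -- 1 write, at t = 330
```

With the 60-second rule the file is rewritten every minute in the middle of the activity, which is exactly when a two-key update may be in flight. Without it, the single write happens only after the 30-second pause at the end, so a crash at t = 299 would lose the whole five minutes of preference changes but leave a consistent file.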
Finally, the team should verify that, when LR exits, all tasks that write preferences have actually terminated before writing the preferences file.
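I don't know how LR's shutdown mechanism is actually implemented, so the following is purely a hypothetical sketch of the ordering being asked for: let every task that can touch preferences run to completion, check that none is still alive, and only then write the file. Coroutines again stand in for LR tasks, and startTask, allTerminated, and shutdown are invented names:

```lua
-- Hypothetical sketch: coroutines stand in for LR tasks.
local prefs, file = { key1 = 0, key2 = 0 }, nil
local tasks = {}

-- Starts a task and remembers it so shutdown can wait for it.
local function startTask(fn)
    local co = coroutine.create(fn)
    tasks[#tasks + 1] = co
    coroutine.resume(co)
    return co
end

local function allTerminated()
    for _, co in ipairs(tasks) do
        if coroutine.status(co) ~= "dead" then return false end
    end
    return true
end

local function shutdown()
    -- Keep resuming suspended tasks until every one has finished.
    while not allTerminated() do
        for _, co in ipairs(tasks) do
            if coroutine.status(co) == "suspended" then coroutine.resume(co) end
        end
    end
    -- Only now is it safe to take the final snapshot.
    assert(allTerminated(), "a task is still running at shutdown")
    file = { key1 = prefs.key1, key2 = prefs.key2 }
end

-- A task that is halfway through a two-key update when shutdown begins.
startTask(function()
    prefs.key1 = 1
    coroutine.yield()
    prefs.key2 = 1
end)

shutdown()
print(file.key1, file.key2)   -- both keys are 1 in the final snapshot
```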
Demonstrating the Race Condition
It’s easy to demonstrate the race condition between assigning preference keys and the background task writing the preferences file. Install this plugin in a Windows LR:
https://www.dropbox.com/s/1gufmz0bwubu8o9/testprefswriting.lrdevplugin.zip?dl=0
Do File > Plug-in Extras > Test Preferences Writing, which starts a background task testing for the race condition.
Next, select about a thousand photos and either scroll Grid view continuously with the Page Up and Page Down keys or do long-running operations such as Read Metadata From File, Save Metadata To File, syncing a change to Exposure, or exporting the photos. Usually an inconsistency is detected within 30 to 90 seconds, at which point the plugin shows a message and terminates.
The plugin’s background task executes this loop:
for i = 1, math.huge do
    prefs.key1 = i
    sleep (0.001)   -- yields, so other tasks (including the preferences writer) can run
    prefs.key2 = i
end
The sleep() simulates an expression that yields to another task during its evaluation.
The plugin checks the contents of the preferences file once a second, checking whether key1 and key2 have identical values.
Performing long-running user operations during the plugin’s execution will delay the sleep() call’s return, sometimes by as much as 10 seconds, making it more likely that the scheduler will run the background preferences task before the sleep() returns.
Windows versus Mac
I’ve done most of my close examination of this issue on Windows, because it’s easier to observe when preferences get saved to a file. Mac LR uses the “plist” mechanism of macOS, which caches preferences in memory and handles writing them to disk itself, and this makes it harder to observe from the outside what LR is doing. I’m sure that a similar race condition exists on Mac, though perhaps at a different frequency, and the same general approach should apply there.