I don't think we are anywhere near having software that can look at a scene, decide what needs to be done, and give you a time estimate or set a budget. But a simpler version of that could just ask the user what is in the shot, plus some details about it, and give an answer based on that. For example, it could ask how many people in the shot need rotoscoping, how many animals (such as horses) need roto, and how much each of them moves. It could ask whether, or how many, things in the shot change very frequently (such as a flag flapping in the wind, which could require many keyframes). Maybe it could also ask if, or how much, very fine detail in the shot needs to be preserved (eg. fine hair strands would be complex). In theory, if the right info is asked for and it can be done quickly enough, it should be able to give faster, more accurate quotes than it otherwise would.

All the motion prediction in the world isn't going to be able to figure out which motion is the actor and which is the background, and what parts need to go and what parts need to stay. Though recently it's been the shots with a lot of motion complexity (eg. galloping horses with multiple horses and people in the shot, people running, flags moving rapidly in the wind, sword fighting with multiple people) that are also long (up to 46 secs at 4K) that have been the most complex. For these videos I think the motion vector complexity would be indicative of the actual complexity. The program could ask whether the overall motion of the video is indicative of the main things that need rotoscoping (eg. foreground subjects), and it could take that answer into account in its calculation (of roto time/complexity and maybe the price to quote).
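To make the idea a bit more concrete, here is a rough Python sketch of what a questionnaire-based estimator might look like. Everything in it is an assumption for illustration: the field names, the `ShotAnswers` and `estimate_roto_hours` names, and especially the weights, which are made-up numbers that a real studio would have to calibrate against its own past jobs. The `motion_score` field stands in for some measured motion vector complexity (scaled 0 to 1) and is only used if the user says the overall motion reflects the things that need roto.

```python
from dataclasses import dataclass


@dataclass
class ShotAnswers:
    """Answers the user gives about one shot (all fields are hypothetical)."""
    duration_secs: float
    fps: float = 24.0
    people: int = 0                  # people that need roto
    animals: int = 0                 # animals (e.g. horses) that need roto
    movement: float = 0.5            # 0 = nearly static, 1 = very fast/erratic
    fast_changing_items: int = 0     # e.g. flags flapping in the wind
    fine_detail: float = 0.0         # 0 = none, 1 = lots of fine hair strands
    motion_is_indicative: bool = False  # does overall motion reflect the roto subjects?
    motion_score: float = 0.0        # measured motion complexity, 0..1 (optional)


# Made-up weights, in artist minutes per frame. Not real industry figures.
MINUTES_PER_FRAME_PERSON = 1.0
MINUTES_PER_FRAME_ANIMAL = 1.5
MINUTES_PER_FRAME_FAST_ITEM = 2.0


def estimate_roto_hours(a: ShotAnswers) -> float:
    """Return a rough estimate of roto hours from the user's answers."""
    frames = a.duration_secs * a.fps

    # Only trust the measured motion if the user says it reflects the subjects.
    movement = a.movement
    if a.motion_is_indicative:
        movement = (a.movement + a.motion_score) / 2

    subjects = (a.people * MINUTES_PER_FRAME_PERSON
                + a.animals * MINUTES_PER_FRAME_ANIMAL)
    # More movement and more fine detail both scale up the per-frame effort.
    per_frame = subjects * (0.5 + movement) * (1 + a.fine_detail)
    # Fast-changing items need dense keyframes regardless of subject movement.
    per_frame += a.fast_changing_items * MINUTES_PER_FRAME_FAST_ITEM

    return frames * per_frame / 60


def quote(a: ShotAnswers, hourly_rate: float) -> float:
    """Turn the hour estimate into a price at the given hourly rate."""
    return round(estimate_roto_hours(a) * hourly_rate, 2)
```

With these placeholder weights, a 10 second shot at 24 fps with two people at medium movement (`ShotAnswers(duration_secs=10, people=2)`) comes out at 8 hours. The same shot with `motion_is_indicative=True` and a `motion_score` of 1.0 comes out at 10 hours, which is the kind of adjustment the last question is meant to allow.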