Video is measured in pixels, not DPI or PPI, so what matters is having enough pixels.
If you are working at 1920 x 1080 and you want to move in on the image until you see only half of it, then the scanned image should be 2X the width of the composition, or two times the height if the image is horizontal.
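As a rough sketch of that rule of thumb, the small Python function below works out the minimum image size for a given move-in. The function name and the `visible_fraction` parameter are made up for illustration, and it assumes "half of it" means half of the image's width ends up filling the frame.

```python
import math

def required_image_size(comp_width, comp_height, visible_fraction):
    """Minimum image size (in pixels) so that the visible part of the
    image still covers the comp at 100% scale.

    visible_fraction is the linear fraction of the image that fills the
    frame at the tightest point of the move (0.5 = you see half of it).
    """
    if not 0 < visible_fraction <= 1:
        raise ValueError("visible_fraction must be in the range (0, 1]")
    factor = 1.0 / visible_fraction
    return math.ceil(comp_width * factor), math.ceil(comp_height * factor)

# Moving in until only half of the image is visible in an HD comp.
print(required_image_size(1920, 1080, 0.5))  # (3840, 2160)
```

If you never move in at all (`visible_fraction` of 1.0), the answer is simply the comp size.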
When doing 2.5D animation, where you break up a still image into foreground, middle ground, and background elements and arrange them in 3D space, you end up working with both scale and distance from the camera. Usually the camera moves are very small, so most of the time 2X comp size is all you need unless the camera move is a straight push in.
Here's the math involved. You don't want any images in your composition that are at an effective scale of more than 100%, but you also don't want images so large that they are never at an effective scale of something close to 100%.
What is effective scale? Say a camera is added to a scene at its default position, footage is added at its default position and made a 3D layer, and the footage (image) is the same frame size as the composition. The footage layer will be at 100% scale, and the distance between the camera and the layer will be equal to the zoom value of the camera. The effective scale is 100%. If you move the layer farther away from the camera and you still want it to fill the frame, you increase the scale so that the layer still fills the frame. If you move the image a long way from the camera and have to adjust the scale value to 1000% to fill the frame, the effective scale is still 100%.
Did you follow that? The same thing applies to moving the camera or the layer closer than the zoom value. If you reduce the distance between the camera and the layer, you increase the effective scale, and things start to fall apart fast. Adding a little motion blur to a move will help if you are moving way in, but if you do things like fly through a window, you have to be careful to keep the effective scale to no more than about 150% so things still look good with motion blur.
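The effective scale idea above boils down to one line of arithmetic: the layer's scale value multiplied by the camera zoom and divided by the distance from the camera to the layer. Here is a minimal sketch in Python. The function names are hypothetical, and the zoom value of 2000 is just a round number for illustration; read the real zoom value from your own camera settings.

```python
def effective_scale(layer_scale_pct, camera_zoom, distance):
    """Effective scale (in percent) of a 3D layer as the camera sees it.

    layer_scale_pct: the layer's Scale property, in percent
    camera_zoom:     the camera's zoom value, in pixels
    distance:        distance from the camera to the layer, in pixels
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return layer_scale_pct * camera_zoom / distance

def closest_distance(layer_scale_pct, camera_zoom, max_effective_pct=150.0):
    """Closest the layer can get to the camera before its effective
    scale goes past max_effective_pct."""
    return layer_scale_pct * camera_zoom / max_effective_pct

zoom = 2000.0  # illustrative value only, not a real camera default

print(effective_scale(100, zoom, zoom))        # layer at the zoom distance
print(effective_scale(1000, zoom, zoom * 10))  # far away, scaled up to fill the frame
print(effective_scale(100, zoom, zoom / 2))    # camera twice as close
print(closest_distance(100, zoom))             # limit for about 150%
```

The first two cases both come out at 100%, which is the point of the 1000% example: a big scale value on a distant layer is not a problem. The third case comes out at 200%, which is where the image starts to soften.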
Most of the time folks think that DPI or PPI is important, but there are no inches in video, so a 1 PPI image will be the same size as a 9999 PPI image when loaded into a composition, as long as the pixel dimensions are the same.
As for format, I would not use JPEG. I would use TIFF, and even 16 or 32 bit if you have that option, because the Photoshop work of separating the image into foreground, middle ground, and background images will be easier. To properly set up your scanner you'll need to measure the images and adjust the scanning resolution (the PPI) to the appropriate value. A 5 x 7 inch image scanned at 200 PPI will give you an image that is 1000 by 1400 pixels. If you have a photo and you want to move in on 1 square inch of the photo in an HD comp, you'll need to set the scanner to about 2000 PPI. You can figure it out using these principles.
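The scanner arithmetic above can be sketched in a few lines of Python. The function names are made up for illustration, and the second function assumes the tightest framing is measured across the width of the comp.

```python
def scanned_pixel_size(width_in, height_in, ppi):
    """Pixel dimensions of a print scanned at the given PPI."""
    return round(width_in * ppi), round(height_in * ppi)

def required_ppi(comp_width_px, inches_across_frame):
    """Scanner PPI needed so that the given width of the print (in inches)
    fills the comp width at 100% scale."""
    if inches_across_frame <= 0:
        raise ValueError("inches_across_frame must be positive")
    return comp_width_px / inches_across_frame

# A 5 x 7 inch print scanned at 200 PPI.
print(scanned_pixel_size(5, 7, 200))  # (1000, 1400)

# Moving in until 1 inch of the print fills a 1920-pixel-wide HD comp.
print(required_ppi(1920, 1.0))        # 1920.0, so about 2000 PPI
```

Rounding up to the next setting your scanner offers gives you a little room to reframe.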
I hope this helps.