Can we see the shot? Basically, it would require:
1. stabilizing the shot
2. cloning (not frame by frame*) or some other way of removing the drawing (e.g. an effect of some kind, or exporting a frame to Ps, cleaning it up, then bringing it back in)
3. re-introducing the motion
4. rotoscoping the arm
5. fixing anything that needs fixing (like lighting changes)
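To make steps 1 to 3 a bit more concrete, here's a toy Python sketch. It's an illustration of the idea only, not what any compositing app actually runs: frames are small 2D lists, the camera motion is assumed to be a known whole-pixel shift per frame (in a real shot that comes from the tracker), and the names `shift`, `apply_patch` and `clean_shot` are made up for this example. Rotoscoping the arm and lighting fixes are left out.

```python
def shift(frame, dx, dy, fill=0):
    """Translate a frame by (dx, dy) pixels, filling exposed edges with `fill`."""
    h, w = len(frame), len(frame[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx, sy = x - dx, y - dy
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = frame[sy][sx]
    return out


def apply_patch(frame, patch, top, left):
    """Paste a clean patch over the area where the drawing sits."""
    out = [row[:] for row in frame]
    for py, row in enumerate(patch):
        for px, value in enumerate(row):
            out[top + py][left + px] = value
    return out


def clean_shot(frames, offsets, patch, top, left, fill=0):
    """Stabilize each frame, apply the same patch, then restore the motion."""
    cleaned = []
    for frame, (dx, dy) in zip(frames, offsets):
        stabilized = shift(frame, -dx, -dy, fill)            # step 1
        patched = apply_patch(stabilized, patch, top, left)  # step 2
        cleaned.append(shift(patched, dx, dy, fill))         # step 3
    return cleaned


# Hypothetical 6x6 scene: background value 5, a 2x2 "drawing" of value 9.
base = [[5] * 6 for _ in range(6)]
for y in (2, 3):
    for x in (2, 3):
        base[y][x] = 9

offsets = [(0, 0), (1, 0), (1, 1)]  # assumed tracking data, one per frame
frames = [shift(base, dx, dy, fill=5) for dx, dy in offsets]

clean_patch = [[5, 5], [5, 5]]      # painted once, on the stabilized image
cleaned = clean_shot(frames, offsets, clean_patch, top=2, left=2, fill=5)
```

The point is that the patch is made once, on the stabilized image, and every frame gets the identical fix. Note that shifting a whole frame there and back loses the pixels that slide off the edge, which is why in practice you'd comp only the patched area back over the original footage.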
Here's a thread demonstrating such a technique:
Re: Problems with tracking
Whatever technique you choose, it's not easy stuff, and it requires a lot of practice and effort from a beginner.
*The reason frame by frame does not work for cleaning up video is that you can never overcome the inconsistencies between brush strokes over time, so it will be very apparent that the pixels were altered. The only time frame by frame works is when the movement is abrupt or very fast.
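If it helps to see why, here's a rough Python sketch of that flicker. Everything in it is an assumption for illustration: a single pixel, a made-up "correct" brightness of 128, and hand painting modelled as landing within a few levels of the target each time.

```python
import random
import statistics


def hand_painted_value(target, rng, wobble=3):
    """One brush stroke: close to the target, but not reliably identical."""
    return target + rng.randint(-wobble, wobble)


def temporal_flicker(values_over_time):
    """Standard deviation of one pixel's value across frames."""
    return statistics.pstdev(values_over_time)


rng = random.Random(0)  # fixed seed so the run is repeatable
target = 128            # hypothetical "correct" brightness for the pixel
frame_count = 24

# Frame by frame: the pixel is repainted by hand on every frame.
frame_by_frame = [hand_painted_value(target, rng) for _ in range(frame_count)]

# Stabilized: the pixel is painted once and that result is reused.
painted_once = hand_painted_value(target, rng)
reused = [painted_once] * frame_count

print("frame-by-frame flicker:", temporal_flicker(frame_by_frame))
print("painted-once flicker:  ", temporal_flicker(reused))
```

The painted-once pixel might be slightly off from the "right" value, but it is off by the same amount on every frame, so nothing shimmers. The repainted pixel changes from frame to frame, and that change is what the eye picks up.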