Screencasters use different techniques. Some start by recording a rough video screencast with basic dialogue and then polish the audio first and the video second, or vice versa. Others start with the video only and add the audio afterwards. If the audio is more important, the audio comes first and the video is developed to follow its timing (this method is mostly recommended for video presentations and animated diagrams rather than screencasts).
Recently, I managed to understand the main thing that changes the tone of the voice when the audio dubbing track is recorded in many takes: the distance of the microphone. If the distance is different, the levels shown on the hardware/software audio interface will still be the same if you raise your voice's volume to compensate, which in turn changes the pitch and possibly the colour of the voice as well. If the microphone is positioned slightly differently in one take, the change will not be perceivable either in the volume level bars or in the direct sound feedback from the hardware/software interface. It will only be noticeable once both tracks (or takes, if the software supports them, as Cubase LE does) are joined together and played back from a few seconds before the point where they join.
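To get a rough feel for how much a small change in distance matters, here is a minimal sketch in Python. It assumes the voice behaves like a point source in a free field, so the sound pressure at the microphone falls off in proportion to distance; it ignores room reflections and the bass boost that directional microphones add at close range. The function name and the example distances are hypothetical and are not measurements from my setup.

```python
import math


def level_change_db(d_ref_cm: float, d_new_cm: float) -> float:
    """Approximate level change in dB when the microphone moves
    from d_ref_cm to d_new_cm away from the mouth.

    Assumption: free-field point source, so pressure is proportional
    to 1/distance. A negative result means the signal got quieter.
    """
    if d_ref_cm <= 0 or d_new_cm <= 0:
        raise ValueError("distances must be positive")
    return -20.0 * math.log10(d_new_cm / d_ref_cm)


if __name__ == "__main__":
    # Hypothetical drift of a hand-held microphone between two takes.
    for new_distance in (5.0, 6.0, 8.0, 10.0):
        change = level_change_db(5.0, new_distance)
        print(f"5.0 cm -> {new_distance:4.1f} cm: {change:+.1f} dB")
```

Under this assumption, doubling the distance costs about 6 dB. To bring the level bars back to where they were, the voice has to make up that difference, and that is where the change in pitch and colour creeps in.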
I have two microphones: one was held by hand, the other was on a stand. Apart from keeping all the main objects around my desk placed as before (including the sound-attenuating sheets), my main trouble arose from my hand getting tired and slowly moving the microphone away from the pop filter (which sits right in front of my lips). By the time the first section was over, that small difference in distance had altered the overall tone of the sound, and putting the microphone back near the pop filter did not always reproduce the position I had held in the previous take.
Of course, the solution was to strap the first microphone to the second one on the stand, so that I could simply perform with my voice without worrying about keeping the microphone at the right distance.
It is a 7:42.500 audio track, which makes it practically impossible to record in one continuous take, unless one possesses the equivalent of 10 normal human lungs, so multiple takes are inevitable.
The other possibility is to avoid video tutorials longer than two minutes.
Other tutors prefer to leave in the natural pauses heard in everyday dialogue. I do not opt for that, because it unnecessarily prolongs the video and makes it more tedious for the audience. Those who are slower to learn can either watch the video several times or lower the playback rate, as I show in this video.