Never used BMP (never used Word, for that matter!), so I have no experience with it. However, my guess is that it's similar to the "PICT" format used in early Macs. It's essentially a pixel-for-pixel copy of the image, usually with no compression, so the files tend to be large. PDF, of course, is a page-description format (originally proprietary to Adobe, now published as an ISO standard) that holds text with font and styling information as well as page layout and images. Lots of extra data there besides just an image. JPEG (both varieties) is a 'lossy' format: some image data is thrown away every time the image is compressed, so re-saving an opened JPEG loses a little more each time, even if it is not edited (simply copying the file loses nothing). So, depending on what the image is, a JPEG can be much smaller than the original.
Scanning, of course, usually results in an image in some format (JPEG, TIFF, PNG, etc.), so the "text" is no longer really text; it's simply a picture of the text. If you want actual, editable text (usually with little, if any, formatting), you'll need OCR to convert the character images back into real character codes (ASCII or Unicode). The result is usually a different size from what the scanner produced, and plain text is typically far smaller than the image it came from. Remember, the size of a scanned file depends heavily on the resolution used: doubling the dpi roughly quadruples the number of pixels.
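
To put rough numbers on the resolution point, here is a small Python sketch that estimates the uncompressed size of a scanned page. The function name, the US Letter page size (8.5 x 11 in) and the bit depths are my own assumptions for illustration; real files (JPEG, PNG, compressed TIFF) will normally be smaller, and headers are ignored.

```python
def raw_scan_bytes(width_in, height_in, dpi, bits_per_pixel=24):
    """Approximate uncompressed size, in bytes, of a scanned page.

    Hypothetical helper for illustration: ignores file headers,
    row padding and any compression the scanner software applies.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Assumed example: a US Letter page at three common scan resolutions.
for dpi in (150, 300, 600):
    colour = raw_scan_bytes(8.5, 11, dpi)                   # 24-bit colour
    grey = raw_scan_bytes(8.5, 11, dpi, bits_per_pixel=8)   # 8-bit greyscale
    print(f"{dpi} dpi: {colour / 1e6:.1f} MB colour, {grey / 1e6:.1f} MB greyscale")
```

Each doubling of the dpi multiplies the raw size by four, which is why a 600 dpi scan is so much bigger than a 150 dpi one of the same page.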