In a class I took at university, we built a two-pass assembler, which is a very simple form of a compiler. (The first pass records the address of each label; the second pass emits the machine code.)
You could write an assembler yourself, and in doing so you would learn how x86 assembly instructions map to binary.
In other words, you would read up on the x86 instruction set (Intel and AMD both publish detailed architecture manuals on the subject) and then do the translations I described above. Just as a practice exercise.
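To make the two-pass idea concrete, here is a minimal sketch in Python. It is not a real assembler: it assumes 32-bit mode and handles only four instructions (nop, ret, mov reg, imm32, and a short jmp to a label), and the names assemble, parse, SIZES and REGS are made up for this example. The opcode bytes themselves (90, C3, B8+reg, EB) are the standard x86 encodings.

```python
import struct

# Register numbers as they appear in 32-bit x86 encodings.
REGS = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
        "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

# Encoded length in bytes of each instruction this toy assembler supports.
SIZES = {"nop": 1, "ret": 1, "mov": 5, "jmp": 2}

def parse(source):
    """Turn source text into ('label', name) or ('insn', mnemonic, operands)."""
    items = []
    for line in source.splitlines():
        line = line.split(";")[0].strip()      # drop comments and whitespace
        if not line:
            continue
        if line.endswith(":"):
            items.append(("label", line[:-1]))
        else:
            mnemonic, _, rest = line.partition(" ")
            operands = [op.strip() for op in rest.split(",") if op.strip()]
            items.append(("insn", mnemonic.lower(), operands))
    return items

def assemble(source):
    items = parse(source)

    # Pass 1: assign an address to every label by summing instruction sizes.
    symbols, address = {}, 0
    for item in items:
        if item[0] == "label":
            symbols[item[1]] = address
        elif item[1] in SIZES:
            address += SIZES[item[1]]
        else:
            raise ValueError("unsupported instruction: " + item[1])

    # Pass 2: emit bytes; forward references can now be resolved.
    out = bytearray()
    for item in items:
        if item[0] == "label":
            continue
        _, mnemonic, ops = item
        if mnemonic == "nop":
            out += b"\x90"
        elif mnemonic == "ret":
            out += b"\xC3"
        elif mnemonic == "mov":                # mov r32, imm32: B8+reg, imm32 little-endian
            imm = int(ops[1], 0) & 0xFFFFFFFF
            out += bytes([0xB8 + REGS[ops[0]]]) + struct.pack("<I", imm)
        elif mnemonic == "jmp":                # jmp rel8: EB, offset from the next instruction
            rel = symbols[ops[0]] - (len(out) + 2)
            out += b"\xEB" + struct.pack("<b", rel)   # fails if target is out of short range
    return bytes(out)

PROGRAM = """
start:
    mov eax, 1      ; 5 bytes at address 0
    jmp end         ; 2 bytes at address 5, jumps forward over the nop
    nop
end:
    ret
"""

print(assemble(PROGRAM).hex())   # b801000000eb0190c3
```

The forward jump to end is the reason for two passes: when the jmp is reached, the address of end is only known because pass 1 already measured every instruction.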
I am not personally familiar with the x86 instruction set, because it is much more complicated than that of the processor I worked with in my architecture courses. We used the MIPS R3000, which is quite different from the x86. The R3000 has a 5-stage pipeline (as I recall), whereas some x86 processors have pipelines of roughly 20 to 30 stages (e.g. the Pentium 4). More to the point for an assembler, every MIPS instruction is exactly 32 bits wide and follows one of a few fixed formats, while x86 instructions vary from 1 to 15 bytes and have many addressing modes. As a result, x86 assembly tends to be a lot more complicated to translate than MIPS assembly.
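For comparison, here is roughly what the MIPS side of the translation looks like, as a small Python sketch. It covers only R-type (register-to-register) instructions and a handful of registers; the function name encode_rtype and the two lookup tables are just names chosen for this example. The field layout (opcode, rs, rt, rd, shamt, funct) and the funct codes are the standard MIPS ones.

```python
# A few MIPS register numbers and R-type function codes.
REG = {"$zero": 0, "$t0": 8, "$t1": 9, "$t2": 10, "$s0": 16, "$s1": 17}
FUNCT = {"add": 0x20, "sub": 0x22, "and": 0x24, "or": 0x25}

def encode_rtype(mnemonic, rd, rs, rt):
    """Encode 'mnemonic rd, rs, rt' as one 32-bit MIPS R-type word.

    Bit layout: opcode(6) | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6)
    The opcode and shamt fields are 0 for these instructions.
    """
    return (REG[rs] << 21) | (REG[rt] << 16) | (REG[rd] << 11) | FUNCT[mnemonic]

# add $t0, $t1, $t2
print(hex(encode_rtype("add", "$t0", "$t1", "$t2")))   # 0x12a4020
```

Every instruction comes out as a single 32-bit word built from fixed bit fields, which is why a MIPS assembler is a manageable class project.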
Anyway, the one other detail I have to profess ignorance about is how to produce the loading information. You see, in addition to the binary that I mentioned generating above, the compiler and linker also write header information into the executable, which the operating system reads to determine the 'entry point' of the code (in other words, where to begin execution). The operating system's loader uses these headers to place the instructions and other data in memory, and then compiler-generated startup code runs before the actual program begins. You'd have to read up on the specifics of the loader mechanism for a Windows program; the executable format there is called Portable Executable (PE).
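As a starting point for that reading, here is a hedged Python sketch that pulls the entry point out of a Windows executable's headers. It relies on three documented facts about the Portable Executable (PE) format: the file starts with "MZ", the 4-byte value at offset 0x3C gives the position of the "PE\0\0" signature, and the AddressOfEntryPoint field sits 16 bytes into the optional header that follows the 20-byte COFF header. The function names and the fake_image helper are invented for this example, and the fake image is only enough header to exercise the parser, not a runnable program.

```python
import struct

def pe_entry_point_rva(image):
    """Return AddressOfEntryPoint (an address relative to the load base)."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)   # offset of the PE header
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    optional_header = e_lfanew + 4 + 20                   # skip signature and COFF header
    (entry_rva,) = struct.unpack_from("<I", image, optional_header + 16)
    return entry_rva

def fake_image(entry_rva, pe_offset=0x80):
    """Build a synthetic header blob for demonstration (hypothetical helper)."""
    buf = bytearray(pe_offset + 4 + 20 + 32)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, pe_offset)
    buf[pe_offset:pe_offset + 4] = b"PE\0\0"
    struct.pack_into("<I", buf, pe_offset + 4 + 20 + 16, entry_rva)
    return bytes(buf)

print(hex(pe_entry_point_rva(fake_image(0x1000))))   # 0x1000
```

On a real file you would pass in the bytes read from the .exe instead of the fake image. The value returned is relative to wherever the loader maps the image in memory, not an offset into the file.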