This page looks at how certain EXEPACK-related programs set the min_extra_paragraphs field in the EXE header. The same field is called e_minalloc in IMAGE_DOS_HEADER terms.

Microsoft EXEPACK.EXE 4.00

I found a copy of EXEPACK.EXE on the PCjs page for Microsoft Macro Assembler 4.00, which also offers various other versions. To save the disk image, select the dropdown for the B: drive, ensure MS Macro Assembler 4.00 is selected, click Load, then Save.

WinWorld is another source for disk images. The Internet Archive has a "Microsoft MASM 4 beta", whose differences from 4.00 I did not examine thoroughly.

Use Mtools to examine and extract the disk image:

$ mdir -i MASM-016014-400.img
 Volume in drive : has no label
Directory for ::/

MASM     EXE     85566 1985-10-16   4:00
LINK     EXE     43988 1985-10-16   4:00
SYMDEB   EXE     37021 1985-10-16   4:00
MAPSYM   EXE     18026 1985-10-16   4:00
CREF     EXE     15028 1985-10-16   4:00
LIB      EXE     28716 1985-10-16   4:00
MAKE     EXE     24300 1985-10-16   4:00
EXEPACK  EXE     10848 1985-10-16   4:00
EXEMOD   EXE     11034 1985-10-16   4:00
COUNT    ASM      5965 1985-10-16   4:00
README   DOC      7630 1985-10-16   4:00
       11 files             288 122 bytes
                             69 632 bytes free

$ mkdir MASM-016014-400
$ cd MASM-016014-400
$ mcopy -i ../MASM-016014-400.img -s ::/ ./

Inside DOSBox or similar, you can run the programs and see the version numbers.

C:\>MASM.EXE
Microsoft (R) Macro Assembler  Version 4.00
Copyright (C) Microsoft Corp 1981, 1983, 1984, 1985.  All rights reserved.

C:\>LINK.EXE
Microsoft (R) 8086 Object Linker  Version 3.05
Copyright (C) Microsoft Corp 1983, 1984, 1985.  All rights reserved.

C:\>EXEPACK.EXE
Microsoft (R) EXE File Compression Utility  Version 4.00
Copyright (C) Microsoft Corp 1985.  All rights reserved.

The disk image conveniently comes with a sample program, COUNT.ASM. Let's EXEPACK-compress it two ways, using EXEPACK.EXE and the /EXEPACK option to LINK.EXE.

C:\>MASM.EXE COUNT.ASM,COUNT.OBJ;

C:\>LINK.EXE COUNT.OBJ,COUNT.EXE;

C:\>EXEPACK.EXE COUNT.EXE COUNTE.EXE

C:\>LINK.EXE /EXEPACK COUNT.OBJ,COUNTL.EXE;

The two compressed files are not identical, but Rabin2 and Radiff2 show only trivial differences, in the header checksum (offset 0x12) and in one byte at offset 0x301:

$ du -b COUNT*.EXE
3081    COUNT.EXE
1092    COUNTE.EXE
1092    COUNTL.EXE

$ sha256sum COUNT*.EXE
10e86814a369a9cf12e7d0ea6930fdf3184692e4cdddae7627aea9ba0add4624  COUNT.EXE
ab629d01a7e99e20153b6dd85c87f5adb9fa211c4daa2a6cc67cc12772973ba1  COUNTE.EXE
548bc5075fc8e98acf2f53e903619ca7e9595a543618fb9a75b45621743bf1b5  COUNTL.EXE

$ rabin2 -H COUNT.EXE
[0000:0000]  Signature           MZ
[0000:0002]  BytesInLastBlock    0x0009
[0000:0004]  BlocksInFile        0x0007
[0000:0006]  NumRelocs           0x0001
[0000:0008]  HeaderParagraphs    0x0020
[0000:000a]  MinExtraParagraphs  0x0000
[0000:000c]  MaxExtraParagraphs  0xffff
[0000:000e]  InitialSs           0x0000
[0000:0010]  InitialSp           0x0100
[0000:0012]  Checksum            0xfdf4
[0000:0014]  InitialIp           0x000c
[0000:0016]  InitialCs           0x0094
[0000:0018]  RelocTableOffset    0x001e
[0000:001a]  OverlayNumber       0x0000

$ rabin2 -H COUNTE.EXE
[0000:0000]  Signature           MZ
[0000:0002]  BytesInLastBlock    0x0044
[0000:0004]  BlocksInFile        0x0003
[0000:0006]  NumRelocs           0x0000
[0000:0008]  HeaderParagraphs    0x0020
[0000:000a]  MinExtraParagraphs  0x0098
[0000:000c]  MaxExtraParagraphs  0xffff
[0000:000e]  InitialSs           0x00b5
[0000:0010]  InitialSp           0x0080
[0000:0012]  Checksum            0x1399
[0000:0014]  InitialIp           0x0010
[0000:0016]  InitialCs           0x0011
[0000:0018]  RelocTableOffset    0x001e
[0000:001a]  OverlayNumber       0x0000

$ rabin2 -H COUNTL.EXE
[0000:0000]  Signature           MZ
[0000:0002]  BytesInLastBlock    0x0044
[0000:0004]  BlocksInFile        0x0003
[0000:0006]  NumRelocs           0x0000
[0000:0008]  HeaderParagraphs    0x0020
[0000:000a]  MinExtraParagraphs  0x0098
[0000:000c]  MaxExtraParagraphs  0xffff
[0000:000e]  InitialSs           0x00b5
[0000:0010]  InitialSp           0x0080
[0000:0012]  Checksum            0x0000
[0000:0014]  InitialIp           0x0010
[0000:0016]  InitialCs           0x0011
[0000:0018]  RelocTableOffset    0x001e
[0000:001a]  OverlayNumber       0x0000

$ radiff2 COUNTE.EXE COUNTL.EXE
0x00000012 9913 => 0000 0x00000012
0x00000301 00 => c3 0x00000301
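
The header fields that Rabin2 prints can also be decoded by hand. The following is a minimal sketch, assuming the usual 28-byte MZ header layout of fourteen little-endian 16-bit words; the function name `parse_mz_header` and the field names are my own labels (modelled on the Rabin2 output), and the sample bytes are rebuilt from the COUNTE.EXE listing rather than read from the file.

```python
import struct

# Names mirror the labels in the Rabin2 output. The 13 words below follow
# the 2-byte "MZ" signature, for 28 header bytes in total.
MZ_FIELDS = (
    "bytes_in_last_block", "blocks_in_file", "num_relocs",
    "header_paragraphs", "min_extra_paragraphs", "max_extra_paragraphs",
    "initial_ss", "initial_sp", "checksum", "initial_ip", "initial_cs",
    "reloc_table_offset", "overlay_number",
)

def parse_mz_header(data):
    """Decode the fixed part of an MZ header into a dict of integers."""
    if len(data) < 28 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    return dict(zip(MZ_FIELDS, struct.unpack_from("<13H", data, 2)))

# Header of COUNTE.EXE, rebuilt from the Rabin2 listing (not read from disk).
COUNTE_HEADER = b"MZ" + struct.pack(
    "<13H",
    0x0044, 0x0003, 0x0000, 0x0020, 0x0098, 0xFFFF,
    0x00B5, 0x0080, 0x1399, 0x0010, 0x0011, 0x001E, 0x0000,
)

header = parse_mz_header(COUNTE_HEADER)
print(header["min_extra_paragraphs"])  # 152
```

With a real file, the same function could be applied to the first 28 bytes, for example `parse_mz_header(open("COUNTE.EXE", "rb").read(28))`.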

Either way, compression has changed the value of the min_extra_paragraphs field (which Rabin2 calls MinExtraParagraphs) from 0x0000 to 0x0098 (152 decimal). Where does this come from? The formula for the size of the program text is

512×(blocks_in_file − 1) + bytes_in_last_block − 16×header_paragraphs

The formula for the size of the additional memory is

16×min_extra_paragraphs

Adding these two values together gives the total runtime size of the program.

file         program size   extra size   total size
COUNT.EXE            2569            0         2569
COUNTE.EXE            580         2432         3012
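
The sizes in the table can be recomputed from the header values shown in the Rabin2 listings. This is a small sketch of the two formulas; the function names `program_size` and `extra_size` are hypothetical, chosen only for this example.

```python
def program_size(blocks_in_file, bytes_in_last_block, header_paragraphs):
    """Size in bytes of the program text, per the formula in the text.

    Like that formula, this does not special-case bytes_in_last_block == 0,
    which conventionally means that the last 512-byte block is full.
    """
    return 512 * (blocks_in_file - 1) + bytes_in_last_block - 16 * header_paragraphs

def extra_size(min_extra_paragraphs):
    """Size in bytes of the additional memory requested by the header."""
    return 16 * min_extra_paragraphs

# (name, blocks_in_file, bytes_in_last_block, header_paragraphs, min_extra_paragraphs)
FILES = [
    ("COUNT.EXE", 0x0007, 0x0009, 0x0020, 0x0000),
    ("COUNTE.EXE", 0x0003, 0x0044, 0x0020, 0x0098),
]

for name, blocks, last, header, min_extra in FILES:
    prog = program_size(blocks, last, header)
    extra = extra_size(min_extra)
    print(name, prog, extra, prog + extra)
# COUNT.EXE 2569 0 2569
# COUNTE.EXE 580 2432 3012
```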

The difference in program sizes accounts for 124 of the 152 paragraphs in the min_extra_paragraphs of COUNTE.EXE. The remaining 28 paragraphs come from the size of the EXEPACK block itself and its 8-paragraph stack (see initial_sp).

In this case, min_extra_paragraphs had to increase to account for the overhead of the EXEPACK block. But if the original min_extra_paragraphs is large enough, the EXEPACK block can make use of the same space, and therefore the difference in min_extra_paragraphs is simply the difference in program sizes.

The formula used by EXEPACK.EXE to compute the new min_extra_paragraphs is:

out.min_extra_paragraphs = in.program_paragraphs + max(in.min_extra_paragraphs, exepack_paragraphs+8) − out.program_paragraphs
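
As a worked example, here is the formula applied to COUNT.EXE. It is a sketch under a few assumptions: program sizes are rounded up to whole paragraphs, and exepack_paragraphs is taken to be 20, which is what remains of the 28 leftover paragraphs after subtracting the 8-paragraph stack. The helper names are hypothetical.

```python
def paragraphs(nbytes):
    """Round a byte count up to whole 16-byte paragraphs."""
    return (nbytes + 15) // 16

def exepack_min_extra(in_program_paragraphs, in_min_extra_paragraphs,
                      exepack_paragraphs, out_program_paragraphs):
    """The formula attributed to EXEPACK.EXE in the text."""
    return (in_program_paragraphs
            + max(in_min_extra_paragraphs, exepack_paragraphs + 8)
            - out_program_paragraphs)

in_prog = paragraphs(2569)    # COUNT.EXE program size -> 161 paragraphs
out_prog = paragraphs(580)    # COUNTE.EXE program size -> 37 paragraphs
exepack_paragraphs = 20       # assumption: 28 leftover paragraphs minus 8 of stack

result = exepack_min_extra(in_prog, 0, exepack_paragraphs, out_prog)
print(hex(result))  # 0x98

# With a large original min_extra_paragraphs, the EXEPACK block fits in the
# existing space and the field grows only by the change in program size.
grown = exepack_min_extra(in_prog, 100, exepack_paragraphs, out_prog)
print(grown - 100, in_prog - out_prog)  # 124 124
```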

The formula comes from reverse engineering part of the program:

                fix_exe_header:
0be6  55            push bp
0be7  8bec          mov bp, sp
0be9  b80a00        mov ax, 10
                    ; Reserve space for local variables.
                    ; bp-10 uint16_t exepack_paragraphs
                    ; bp-8  uint16_t out_file_size_low
                    ; bp-6  uint16_t out_file_size_high
0bec  e8ed04        call stack_check
0bef  57            push di
0bf0  56            push si
0bf1  a1c42b        mov ax, word [exepack_size]
0bf4  050f00        add ax, 15
0bf7  b104          mov cl, 4
0bf9  d3e8          shr ax, cl
0bfb  8946f6        mov word [exepack_paragraphs], ax   ; exepack_paragraphs = (exepack_size+15)/16
0bfe  b80200        mov ax, 2
0c01  50            push ax
0c02  2bc0          sub ax, ax
0c04  50            push ax
0c05  50            push ax
0c06  ff36bc2b      push word [out_fd]
0c0a  e8db06        call file_seek
0c0d  83c408        add sp, 8
0c10  8946f8        mov word [out_file_size_low], ax
0c13  8956fa        mov word [out_file_size_high], dx   ; out_file_size = file_seek(out_fd, 0, 0, SEEK_END)
0c16  80e401        and ah, 1
0c19  a3a802        mov word [out_bytes_in_last_block], ax      ; out_bytes_in_last_block = out_file_size % 512
0c1c  8b46f8        mov ax, word [out_file_size_low]
0c1f  05ff01        add ax, 511
0c22  83d200        adc dx, 0
0c25  b109          mov cl, 9
0c27  e87005        call shr_long
0c2a  a3aa02        mov word [out_blocks_in_file], ax   ; out_blocks_in_file = (out_file_size+511)/512
0c2d  a1c02b        mov ax, word [in_exe_size_low]
0c30  8b16c22b      mov dx, word [in_exe_size_high]
0c34  b104          mov cl, 4
0c36  e86105        call shr_long               ; dx:ax = in_exe_size/16
0c39  8b4ef6        mov cx, word [exepack_paragraphs]
0c3c  03c8          add cx, ax
0c3e  890eb402      mov word [out_ss], cx       ; out_ss = in_exe_size/16 + exepack_paragraphs
0c42  c706b6028000  mov word [out_sp], 0x80     ; out_sp = 0x80
0c48  a15007        mov ax, word [compressed_paragraphs]
0c4b  0106bc02      add word [out_cs], ax       ; out_cs += compressed_paragraphs
0c4f  a1c02b        mov ax, word [in_exe_size_low]
0c52  8b16c22b      mov dx, word [in_exe_size_high]
0c56  b104          mov cl, 4
0c58  e83f05        call shr_long               ; dx:ax = in_exe_size/16
0c5b  8b4ef6        mov cx, word [exepack_paragraphs]
0c5e  83c108        add cx, 8                   ; cx = exepack_paragraphs+8
0c61  8bf8          mov di, ax
0c63  3b0e5c07      cmp cx, word [in_min_extra_paragraphs]      ; exepack_paragraphs+8 >= in_min_extra_paragraphs?
0c67  7305          jae l1
                    ; in_min_extra_paragraphs is greater.
0c69  a15c07        mov ax, word [in_min_extra_paragraphs]      ; ax = in_min_extra_paragraphs
0c6c  eb06          jmp set_min_extra_paragraphs
                l1:
                    ; exepack_paragraphs+8 is greater.
0c6e  8b46f6        mov ax, word [exepack_paragraphs]
0c71  050800        add ax, 8                                   ; ax = exepack_paragraphs+8
                set_min_extra_paragraphs:
0c74  2b46f6        sub ax, word [exepack_paragraphs]           ; ax -= exepack_paragraphs
0c77  03c7          add ax, di                                  ; ax += in_exe_size/16
0c79  2b065007      sub ax, word [compressed_paragraphs]        ; ax -= compressed_paragraphs
                    ; out_min_extra_paragraphs = in_exe_size/16 + max(in_min_extra_paragraphs, exepack_paragraphs+8) - (compressed_paragraphs + exepack_paragraphs)
0c7d  a3b002        mov word [out_min_extra_paragraphs], ax
0c80  a15e07        mov ax, word [in_max_extra_paragraphs]
0c83  a3b202        mov word [out_max_extra_paragraphs], ax     ; out_max_extra_paragraphs = in_max_extra_paragraphs
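
For readers who prefer not to trace the assembly, the same steps can be written out in Python. This is only a sketch of the listing, not a reconstruction of EXEPACK.EXE's source. The inputs used in the example are inferred so that the results match the COUNTE.EXE header; they were not observed in the program. In particular, in_exe_size = 2576 (2569 rounded up to a paragraph boundary), exepack_size = 308, compressed_paragraphs = 17, and a starting out_cs of 0 are all assumptions.

```python
def fix_exe_header(out_file_size, in_exe_size, exepack_size,
                   compressed_paragraphs, in_min_extra_paragraphs,
                   in_max_extra_paragraphs, out_cs=0):
    """Python rendering of the header fix-up steps in the listing."""
    exepack_paragraphs = (exepack_size + 15) >> 4        # 0bf1..0bfb
    in_paragraphs = (in_exe_size >> 4) & 0xFFFF          # shr_long, low word
    # 0c5b..0c71: the larger of the two candidates for the reserved space.
    reserve = max(in_min_extra_paragraphs, exepack_paragraphs + 8)
    return {
        "bytes_in_last_block": out_file_size & 0x1FF,    # and ah, 1
        "blocks_in_file": ((out_file_size + 511) >> 9) & 0xFFFF,
        "ss": (in_paragraphs + exepack_paragraphs) & 0xFFFF,
        "sp": 0x80,
        "cs": (out_cs + compressed_paragraphs) & 0xFFFF,
        # 0c74..0c7d, with 16-bit wraparound as in the original registers.
        "min_extra_paragraphs": (reserve - exepack_paragraphs
                                 + in_paragraphs
                                 - compressed_paragraphs) & 0xFFFF,
        "max_extra_paragraphs": in_max_extra_paragraphs,
    }

# Inferred inputs (assumptions, see above) that reproduce COUNTE.EXE.
fields = fix_exe_header(
    out_file_size=1092, in_exe_size=2576, exepack_size=308,
    compressed_paragraphs=17, in_min_extra_paragraphs=0,
    in_max_extra_paragraphs=0xFFFF,
)
for name, value in fields.items():
    print(f"{name:22} 0x{value:04x}")
```

Under these assumptions the output agrees with the Rabin2 listing for COUNTE.EXE: 0x0044, 0x0003, 0x00b5, 0x0080, 0x0011, 0x0098, and 0xffff.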

Microsoft EXEPACK.EXE refuses to run if the output file would be bigger than the input file. My exepack program does support this case, though, so it uses a slightly more complicated formula (which is equivalent when out.program_paragraphs ≤ in.program_paragraphs):

out.min_extra_paragraphs = max(
in.program_paragraphs + in.min_extra_paragraphs,
in.program_paragraphs + exepack_paragraphs + 8,
out.program_paragraphs + exepack_paragraphs + 8
) − out.program_paragraphs
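
The relationship between the two formulas can be checked by brute force over small values. The sketch below uses hypothetical function names and arbitrary small ranges. It also shows one case where the output is larger than the input: there the three-way maximum still leaves exepack_paragraphs + 8 paragraphs beyond the output program, while the simpler formula does not.

```python
def min_extra_exepack_exe(in_prog, in_min_extra, exepack_paragraphs, out_prog):
    """Formula recovered from EXEPACK.EXE."""
    return in_prog + max(in_min_extra, exepack_paragraphs + 8) - out_prog

def min_extra_general(in_prog, in_min_extra, exepack_paragraphs, out_prog):
    """Three-way max that also covers output larger than input."""
    return max(
        in_prog + in_min_extra,
        in_prog + exepack_paragraphs + 8,
        out_prog + exepack_paragraphs + 8,
    ) - out_prog

def formulas_agree_when_not_larger(limit=24):
    """Compare both formulas for all small inputs with out_prog <= in_prog."""
    for in_prog in range(limit):
        for out_prog in range(in_prog + 1):
            for in_min_extra in range(limit):
                for ep in range(limit):
                    args = (in_prog, in_min_extra, ep, out_prog)
                    if min_extra_exepack_exe(*args) != min_extra_general(*args):
                        return False
    return True

print(formulas_agree_when_not_larger())      # True
# Output larger than input (10 -> 15 paragraphs), exepack_paragraphs = 20:
print(min_extra_exepack_exe(10, 0, 20, 15))  # 23
print(min_extra_general(10, 0, 20, 15))      # 28
```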

UNP 4.11

When UNP decompresses a file, it sets min_extra_paragraphs according to the formula

out.min_extra_paragraphs = max(0x1000, in.program_paragraphs + 512 + in.min_extra_paragraphs) − 512 − out.program_paragraphs

In the case of a largish program that has in.program_paragraphs + in.min_extra_paragraphs ≥ 0x1000 − 512, the formula simplifies to

out.min_extra_paragraphs = in.program_paragraphs + in.min_extra_paragraphs − out.program_paragraphs
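
A quick numeric check of the simplification, as a sketch with hypothetical function names and made-up paragraph counts. The constant 512 is written as EXTRAMEM, the name it has in the UNP source.

```python
EXTRAMEM = 512      # paragraphs
MIN_TOTAL = 0x1000  # paragraphs (64 KB)

def unp_min_extra(in_prog, in_min_extra, out_prog):
    """Full formula, all quantities in paragraphs."""
    return max(MIN_TOTAL, in_prog + EXTRAMEM + in_min_extra) - EXTRAMEM - out_prog

def unp_min_extra_large(in_prog, in_min_extra, out_prog):
    """Simplified form, valid when in_prog + in_min_extra >= MIN_TOTAL - EXTRAMEM."""
    return in_prog + in_min_extra - out_prog

print(MIN_TOTAL - EXTRAMEM)  # 3584

# Above the threshold (3000 + 600 >= 3584) the two forms agree.
print(unp_min_extra(3000, 600, 3500), unp_min_extra_large(3000, 600, 3500))  # 100 100

# Below the threshold (1000 + 100 < 3584) the 0x1000 floor takes over.
print(unp_min_extra(1000, 100, 2000), unp_min_extra_large(1000, 100, 2000))  # 1584 -900
```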

This computation can be read from the file u4.asm in the UNP source code. The MoreStrucInfo label sets TotalMem = max(0x1000, in_ExeImageSz/16 + EXTRAMEM + in_MinParMem):

MoreStrucInfo:
                ; ...
		mov	ds,SegEHInfo.A
		ASSUME	ds:NOTHING
		mov	ax,ExeImageSz
		mov	dx,ExeImageSz+2
		div	ParSize			;; ax = ExeImageSz / 16
		xor	dx,dx
		add	ax,EXTRAMEM		;; ax += 512
		adc	dl,0
		add	ax,ds:[MinParMem]	;; ax += MinParMem
		adc	dl,0
		or	dx,dx			; size above 1Mb ?
		jne	LoadError
		cmp	ax,01000h		; 64K?
		jae	UseMem
		mov	ax,01000h		;; ax = 0x1000
UseMem:
		mov	TotalMem,ax		;; TotalMem = ax

The CalcSize label then does out_MinParMem = TotalMem − EXTRAMEM − (out_ExeImageSz+1)/16:

CalcSize:
		mov	ax,ProgFinalSeg		; calculate new image size
		xor	dx,dx
		sub	ax,SegProgram
		sbb	dx,0
		mov	cx,4
LongMul16:
		shl	ax,1
		rcl	dx,1
		loop	LongMul16		;; dx:ax = (ProgFinalSeg - SegProgram) * 16

		add	ax,ProgFinalOfs
		adc	dx,0			;; dx:ax += ProgFinalOfs
		add	ax,ExeSizeAdjust
		adc	dx,[ExeSizeAdjust+2]	;; dx:ax += 1 (not sure what this is for)
		mov	ExeImageSz,ax
		mov	ExeImageSz+2,dx

		div	ParSize
		xchg	ax,bx			;; bx = ExeImageSz/16
		mov	ax,TotalMem		;; ax = TotalMem
		sub	ax,EXTRAMEM		;; ax -= EXTRAMEM
		sub	ax,bx			;; ax -= ExeImageSz/16
		cmp	ax,0A000h
		jb	MinMemOk
		xor	ax,ax			; no minimal memory
MinMemOk:
		cmp	HeaderStored,0
		jne	_label01
		mov	es:[MinParMem],ax	;; MinParMem = ax

See it in action using the compressed COUNTE.EXE from the EXEPACK.EXE section:

C:\>UNP.EXE -v COUNTE.EXE COUNTEU.EXE

UNP 4.11 Executable file restore utility, written by Ben Castricum, 05/30/95

INFO - DOS Version 5.00
INFO - Commandline = "E -I -K+ -U -V COUNTE.EXE COUNTEU.EXE".
INFO - Using UNPTEMP$.$$$ as temp file.
INFO - Wildcard matches 1 filename(s), stored at 0000h.
INFO - Program loaded at 0192h, largest free memory block: 632123 bytes.

processing file : COUNTE.EXE
DOS file size   : 1092
file-structure  : executable (EXE)
EXE part sizes  : header 512 bytes, image 580 bytes, overlay 0 bytes
INFO - File uses 0 fixups and requires atleast 3012 bytes to load.
INFO - Loading program at 1010h, blocksize 65536 bytes.
INFO - Required mem. 0098h, desired mem. FFFFh, header slack 484 bytes.
processed with  : EXEPACK V4.00
action          : decompressing... done
new file size   : 2608
writing to file : COUNTEU.EXE

$ rabin2 -H COUNTE.EXE
[0000:0000]  Signature           MZ
[0000:0002]  BytesInLastBlock    0x0044
[0000:0004]  BlocksInFile        0x0003
[0000:0006]  NumRelocs           0x0000
[0000:0008]  HeaderParagraphs    0x0020
[0000:000a]  MinExtraParagraphs  0x0098
[0000:000c]  MaxExtraParagraphs  0xffff
[0000:000e]  InitialSs           0x00b5
[0000:0010]  InitialSp           0x0080
[0000:0012]  Checksum            0x1399
[0000:0014]  InitialIp           0x0010
[0000:0016]  InitialCs           0x0011
[0000:0018]  RelocTableOffset    0x001e
[0000:001a]  OverlayNumber       0x0000

$ rabin2 -H COUNTEU.EXE
[0000:0000]  Signature           MZ
[0000:0002]  BytesInLastBlock    0x0030
[0000:0004]  BlocksInFile        0x0006
[0000:0006]  NumRelocs           0x0001
[0000:0008]  HeaderParagraphs    0x0002
[0000:000a]  MinExtraParagraphs  0x0d5f
[0000:000c]  MaxExtraParagraphs  0xffff
[0000:000e]  InitialSs           0x0000
[0000:0010]  InitialSp           0x0100
[0000:0012]  Checksum            0x1399
[0000:0014]  InitialIp           0x000c
[0000:0016]  InitialCs           0x0094
[0000:0018]  RelocTableOffset    0x001c
[0000:001a]  OverlayNumber       0x0000

TotalMem is set to

max(0x1000, (512×0x0002+0x0044 − 16×0x0020)/16 + 512 + 0x0098)
= max(0x1000, 580/16 + 512 + 152)
= max(0x1000, 700) = 0x1000.

At 700 paragraphs, this program is below UNP's minimum memory threshold of 0x1000 paragraphs (64 KB), so TotalMem takes the threshold value. Then out_MinParMem becomes

0x1000 − 512 − (512×0x0005+0x0030 − 16×0x0002 + 1)/16
= 0x1000 − 512 − 2577/16
= 3423 = 0xd5f.
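
The same arithmetic, written as a sketch that follows the two assembly excerpts more closely: it uses truncating division, 16-bit wraparound, the LoadError overflow check, and the 0A000h clamp. The function names are hypothetical, and treating ExeSizeAdjust as the constant 1 follows the annotation in the listing rather than any independent knowledge of UNP.

```python
EXTRAMEM = 512  # paragraphs

def unp_total_mem(in_image_size, in_min_par_mem):
    """MoreStrucInfo: TotalMem in paragraphs, with a floor of 0x1000."""
    total = in_image_size // 16 + EXTRAMEM + in_min_par_mem
    if total > 0xFFFF:
        # The carries collected in dl make UNP jump to LoadError.
        raise ValueError("size above 1 MB")
    return max(total, 0x1000)

def unp_out_min_par_mem(total_mem, out_image_size, exe_size_adjust=1):
    """CalcSize: new MinParMem, using 16-bit arithmetic."""
    image_paragraphs = (out_image_size + exe_size_adjust) // 16
    ax = (total_mem - EXTRAMEM - image_paragraphs) & 0xFFFF
    # Values of 0xA000 and above (including wrapped negatives) become 0.
    return 0 if ax >= 0xA000 else ax

in_image_size = 512 * 2 + 0x44 - 16 * 0x20   # COUNTE.EXE: 580 bytes
out_image_size = 512 * 5 + 0x30 - 16 * 0x02  # COUNTEU.EXE: 2576 bytes

total = unp_total_mem(in_image_size, 0x0098)
out_min = unp_out_min_par_mem(total, out_image_size)
print(hex(total), hex(out_min))  # 0x1000 0xd5f
```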

Because UNP does not round up its in_ExeImageSz and out_ExeImageSz to a multiple of 16 before dividing, it may compute a value of out_MinParMem that is 1 paragraph smaller than it should be.