How to dynamically add data to an executable file

Introduction

In article #2 we discussed how it is often useful to distribute data that is embedded in a program's executable file. That article solved the problem by writing the data to resource files and linking them into the program at compile time. This article solves the same problem in a different way – by appending the data to the executable file. This has the advantage that the data doesn't need to be linked in at compile time and can be added or updated dynamically. Data can be attached to an already compiled file. The only disadvantage of this method is that it's slightly harder to read the data at run time than it is when using resources.

A typical use for this technique would be in creating install programs. We have the actual installer as a program stub and append the files to be installed to the end of the stub. The installer stub would contain code to extract the files.

Overview

The Windows PE file format permits additional data to be appended to the executable file. This data is ignored by the Windows program loader, so we can use the data for our own purposes without affecting the program code. For the purposes of this article let's call this data the payload.

The problem to be solved is how to denote that an executable file has a payload and how to find out what size it is. We must be able to do this without modifying the executable portion of the file. Following Daniel Polistchuck we will use a special record to identify the payload. This record will follow the payload data.

So, an executable file that contains a payload has three main components (in order):

  1. The original executable file.
  2. The payload data.
  3. A footer record that identifies that a payload is present and records the size of the executable code and the payload.

Our task, then, is to produce code that can create, modify, read and delete a payload, and check whether one exists. To enable this we must be able to detect, read and update the payload footer.

We begin our discussion by investigating how to handle this footer.

Payload footer record

In the overview we noted that the payload footer has three purposes:

  1. To identify that a payload is present.
  2. To record the size of the payload data.
  3. To record the size of the original executable file.

Listing 1 defines a Pascal record that records all the required information.

  type
    TPayloadFooter = packed record
      WaterMark: TGUID;
      ExeSize: LongInt;
      DataSize: LongInt;
    end;
Listing 1

The purpose of the fields is as follows:

  • The Watermark field is a "magic number" that is used to identify the fact that a payload is present. A GUID is used to try to ensure that the watermark is as unique as possible, to reduce the likelihood that the last few bytes of an executable file will be detected incorrectly as a payload footer. This field is always set to the same fixed value. In a while we will see how this field is used to detect a payload.
  • The ExeSize field stores the size of the original executable file before the payload was appended. The field also specifies the offset of the start of the payload data in the file, since the payload immediately follows the executable code.
  • The DataSize field stores the size of the payload itself. This is in fact redundant information – its value can be deduced by subtracting the value of the ExeSize field and the size of the TPayloadFooter record from the size of the file. However by providing this field we can simplify subsequent code.
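
The redundancy of DataSize can be made concrete with a little arithmetic. The following is only a minimal sketch and is not part of the article's code: the function name DeducedDataSize and the sample sizes are hypothetical, and the record from Listing 1 is repeated so that the program compiles on its own.

```pascal
program DeduceDataSize;

{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

type
  // Repeated from Listing 1 so that the sketch is self-contained
  TPayloadFooter = packed record
    WaterMark: TGUID;
    ExeSize: LongInt;
    DataSize: LongInt;
  end;

// Hypothetical helper: deduces the payload size from the total file
// length and the value of the footer's ExeSize field
function DeducedDataSize(FileLen, ExeSize: LongInt): LongInt;
begin
  Result := FileLen - ExeSize - SizeOf(TPayloadFooter);
end;

begin
  // Size of the packed footer: 24 bytes where LongInt is 32 bits wide,
  // as it is on Windows
  WriteLn(SizeOf(TPayloadFooter));
  // Illustrative figures only: a 100000 byte executable followed by a
  // 2500 byte payload and a 24 byte footer makes a 102524 byte file
  WriteLn(DeducedDataSize(102524, 100000));
end.
```

With a 24 byte footer the second line printed is 2500, which is the value that would also be stored in DataSize. Comparing the deduced and stored values could serve as an extra sanity check on a footer, although the code in this article does not do so.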

We will often need to create a new, blank, footer record that contains the correct watermark. Listing 2 illustrates a simple helper procedure that initializes such a blank footer.

  const
    // arbitrary watermark constant: must not be all-zeroes
    cWaterMarkGUID: TGUID = '{9FABA105-EDA8-45C3-89F4-369315A947EB}';

  procedure InitFooter(out Footer: TPayloadFooter);
  begin
    FillChar(Footer, SizeOf(Footer), 0);
    Footer.WaterMark := cWaterMarkGUID;
  end;
Listing 2

The routine simply zeroes the footer record and stores the required watermark in it.

Let us now consider how to check for the presence of a payload in a file. This is done by reading the final SizeOf(TPayloadFooter) bytes from the file into a TPayloadFooter record and checking if the Watermark field contains the expected magic number. If this is the case it is safe to assume we have a payload and that the record provides valid information about the size of the payload and executable file.

Listing 3 shows a helper routine that checks an open file for a payload and, if one is present, returns its footer. The routine operates on a standard Pascal un-typed file.

  function ReadFooter(var F: File; out Footer: TPayloadFooter): Boolean;
  var
    FileLen: Integer;
  begin
    // Check that file is large enough for a footer
    FileLen := FileSize(F);
    if FileLen > SizeOf(Footer) then
    begin
      // Big enough: move to start of footer and read it
      Seek(F, FileLen - SizeOf(Footer));
      BlockRead(F, Footer, SizeOf(Footer));
    end
    else
      // File not large enough for footer: zero it
      // .. this ensures watermark is invalid
      FillChar(Footer, SizeOf(Footer), 0);
    // Check if watermark is valid
    Result := IsEqualGUID(Footer.WaterMark, cWaterMarkGUID);
  end;
Listing 3

ReadFooter first gets the size of the file and checks if it is large enough to contain a payload footer. If the file is large enough it moves the file pointer to SizeOf(TPayloadFooter) bytes back from the end of the file and reads in the footer record. If the file is too small then the routine fills a footer record with zeros, making it invalid (i.e. the watermark is zero). Finally the routine checks if the record's Watermark field contains the required GUID and returns the result. The footer record is passed out as a parameter.
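
The same check does not depend on un-typed files. As a hedged sketch, here is a stream based counterpart that a program could use to inspect its own executable, as an installer stub might. The name ReadFooterFromStream is hypothetical, the use of TFileStream and ParamStr(0) is an assumption made for illustration, and the declarations from Listings 1 and 2 are repeated so that the program stands alone.

```pascal
program CheckOwnPayload;

{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils, Classes;

type
  // Repeated from Listing 1
  TPayloadFooter = packed record
    WaterMark: TGUID;
    ExeSize: LongInt;
    DataSize: LongInt;
  end;

const
  // Repeated from Listing 2
  cWaterMarkGUID: TGUID = '{9FABA105-EDA8-45C3-89F4-369315A947EB}';

// Hypothetical stream based counterpart of ReadFooter from Listing 3
function ReadFooterFromStream(Stm: TStream;
  out Footer: TPayloadFooter): Boolean;
begin
  // Zero the footer so the watermark is invalid unless a read succeeds
  FillChar(Footer, SizeOf(Footer), 0);
  if Stm.Size > SizeOf(Footer) then
  begin
    // A footer, if present, occupies the last SizeOf(Footer) bytes
    Stm.Position := Stm.Size - SizeOf(Footer);
    Stm.ReadBuffer(Footer, SizeOf(Footer));
  end;
  Result := IsEqualGUID(Footer.WaterMark, cWaterMarkGUID);
end;

var
  Stm: TFileStream;
  Footer: TPayloadFooter;
begin
  // Open this program's own executable file for shared reading
  Stm := TFileStream.Create(ParamStr(0), fmOpenRead or fmShareDenyNone);
  try
    if ReadFooterFromStream(Stm, Footer) then
      WriteLn('Payload of ', Footer.DataSize, ' bytes at offset ',
        Footer.ExeSize)
    else
      WriteLn('No payload found');
  finally
    Stm.Free;
  end;
end.
```

Run straight after compilation, the program should report that no payload was found, since nothing has yet been appended to it. The file is opened read only because a running executable cannot normally be opened for writing on Windows.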

We now have enough information to enable us to move on to develop a class that helps us to manage payloads.

Payload management class

In this section we will develop a class that lets us manage payloads. Let us first define the requirements of the class. They are:

  • To check if an executable file contains a payload.
  • To find the size of the payload data.
  • To extract the payload from a file into a suitably sized buffer.
  • To delete a payload from a file.
  • To store payload data in a file.

The class is declared as follows:

  type
    TPayload = class(TObject)
    private
      fFileName: string;
      fOldFileMode: Integer;
      fFile: File;
      procedure Open(Mode: Integer);
      procedure Close;
    public
      constructor Create(const FileName: string);
      function HasPayload: Boolean;
      function PayloadSize: Integer;
      procedure SetPayload(const Data; const DataSize: Integer);
      procedure GetPayload(var Data);
      procedure RemovePayload;
    end;
Listing 4

The public methods are:

  • Create – Creates an object to work on a named file.
  • HasPayload – Returns true if the file contains a payload.
  • PayloadSize – Returns the size of the payload. This information is required when allocating a buffer into which payload data can be read using GetPayload.
  • SetPayload – Copies a specified number of bytes from a buffer and stores them as a payload at the end of the file. Overwrites any existing payload.
  • GetPayload – Copies the file's payload into a given buffer. The buffer must be large enough to store all the payload. The required buffer size is given by PayloadSize.
  • RemovePayload – Deletes any payload from the file and removes the footer record. This method restores the file to its original condition.

In addition there are two private helper methods:

  • Open – Opens the file in a specified mode.
  • Close – Closes the file and restores the original file mode.

The class also has three fields:

  • fFileName – Stores the name of the file we are manipulating.
  • fOldFileMode – Preserves the current Pascal file mode.
  • fFile – Pascal file descriptor that records the details of an open file.

As can be seen from the discussion of the fields we will be using standard un-typed Pascal file routines to manipulate the file.

We will discuss the implementation of the class in several chunks. We begin in Listing 5 where we look at the constructor, some required constants and the two private helper methods.

  const
    // Untyped file open modes
    cReadOnlyMode = 0;
    cReadWriteMode = 2;

  constructor TPayload.Create(const FileName: string);
  begin
    // create object and record name of payload file
    inherited Create;
    fFileName := FileName;
  end;

  procedure TPayload.Open(Mode: Integer);
  begin
    // open file with given mode, recording current one
    fOldFileMode := FileMode;
    AssignFile(fFile, fFileName);
    FileMode := Mode;
    Reset(fFile, 1);
  end;

  procedure TPayload.Close;
  begin
    // close file and restore previous file mode
    CloseFile(fFile);
    FileMode := fOldFileMode;
  end;
Listing 5

The two constants define the two Pascal file modes we will require. The constructor simply records the name of the file associated with the class. The Open method first stores the current file mode then opens the file using the required file mode. Finally, Close closes the file and restores the original file mode.

We next consider the two methods that provide information about a file's payload – PayloadSize and HasPayload:

  function TPayload.PayloadSize: Integer;
  var
    Footer: TPayloadFooter;
  begin
    // assume no data
    Result := 0;
    // open file
    Open(cReadOnlyMode);
    try
      // read footer and if valid return data size
      if ReadFooter(fFile, Footer) then
        Result := Footer.DataSize;
    finally
      Close;
    end;
  end;

  function TPayload.HasPayload: Boolean;
  begin
    // we have a payload if size is greater than 0
    Result := PayloadSize > 0;
  end;
Listing 6

The only method here of any substance is PayloadSize. We first assume a payload size of zero in case there is no payload. Next we open the file in read mode and attempt to read the footer. The ReadFooter helper routine is used to do this. If the footer is read successfully we get the size of the payload from the footer record's DataSize field. The file is then closed.

HasPayload simply calls PayloadSize and checks if the payload size it returns is greater than zero.

Now we move on to consider GetPayload which is described in Listing 7. This method's Data parameter is a data buffer which must have a size of at least PayloadSize bytes.

  procedure TPayload.GetPayload(var Data);
  var
    Footer: TPayloadFooter;
  begin
    // open file as read only
    Open(cReadOnlyMode);
    try
      // read footer
      if ReadFooter(fFile, Footer) and (Footer.DataSize > 0) then
      begin
        // move to end of exe code and read data
        Seek(fFile, Footer.ExeSize);
        BlockRead(fFile, Data, Footer.DataSize);
      end;
    finally
      // close file
      Close;
    end;
  end;
Listing 7

GetPayload opens the file in read only mode and tries to read the footer record. If we succeed in reading a footer and the payload contains data we move the file pointer to the start of the payload, then read the payload into the Data buffer. Note that we use the footer record's ExeSize field to perform the seek operation and the DataSize field to determine how many bytes to read. The method ends by closing the file.

Finally we examine the implementation of the two methods that modify the file – RemovePayload and SetPayload.

  procedure TPayload.RemovePayload;
  var
    PLSize: Integer;
    FileLen: Integer;
  begin
    // get size of payload
    PLSize := PayloadSize;
    if PLSize > 0 then
    begin
      // we have payload: open file
      Open(cReadWriteMode);
      try
        // get file size inside the try block so the file is always closed
        FileLen := FileSize(fFile);
        // seek to end of exec code and truncate file there
        Seek(fFile, FileLen - PLSize - SizeOf(TPayloadFooter));
        Truncate(fFile);
      finally
        Close;
      end;
    end;
  end;

  procedure TPayload.SetPayload(const Data; const DataSize: Integer);
  var
    Footer: TPayloadFooter;
  begin
    // remove any existing payload
    RemovePayload;
    if DataSize > 0 then
    begin
      // we have some data: open file for writing
      Open(cReadWriteMode);
      try
        // create a new footer with required data
        InitFooter(Footer);
        Footer.ExeSize := FileSize(fFile);
        Footer.DataSize := DataSize;
        // write data and footer at end of exe code
        Seek(fFile, Footer.ExeSize);
        BlockWrite(fFile, Data, DataSize);
        BlockWrite(fFile, Footer, SizeOf(Footer));
      finally
        Close;
      end;
    end;
  end;
Listing 8

RemovePayload checks the existing payload's size and proceeds only if a payload is present. If so the file is opened for writing and its size is noted. We then seek to the end of the executable part of the file and truncate it before closing the file. We have calculated the end of the executable section by deducting the payload size and the size of the footer record from the file length. We could also have read the footer and simply used the value of its ExeSize field.

SetPayload takes two parameters: a data buffer (Data) and the size of the buffer (DataSize). The method begins by using RemovePayload to remove any existing payload, ensuring that the file contains only the executable code. If the payload contains some data we open the file for writing. A new footer record is then initialized using the InitFooter helper routine, then the sizes of both the executable file and of the new payload are stored in the record. Finally we append the payload data and the footer record to the file before closing it.
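
The file layout that these methods produce and consume can be exercised end to end in a small stand-alone program. The sketch below is illustrative only: it uses TFileStream rather than the TPayload class so that it compiles on its own, and the scratch file name, the stand-in "executable" bytes and the payload text are all hypothetical. The declarations from Listings 1 and 2 are repeated.

```pascal
program PayloadRoundTrip;

{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils, Classes;

type
  // Repeated from Listing 1
  TPayloadFooter = packed record
    WaterMark: TGUID;
    ExeSize: LongInt;
    DataSize: LongInt;
  end;

const
  // Repeated from Listing 2
  cWaterMarkGUID: TGUID = '{9FABA105-EDA8-45C3-89F4-369315A947EB}';
  // Hypothetical scratch file standing in for a compiled executable
  cScratchFile = 'payload-demo.bin';
  cFakeExe: AnsiString = 'pretend this is executable code';
  cPayload: AnsiString = 'Hello, payload!';

var
  Stm: TFileStream;
  Footer: TPayloadFooter;
  Data: AnsiString;
begin
  // 1. Create the stand-in "executable"
  Stm := TFileStream.Create(cScratchFile, fmCreate);
  try
    Stm.WriteBuffer(cFakeExe[1], Length(cFakeExe));
  finally
    Stm.Free;
  end;

  // 2. Append the payload followed by its footer, as SetPayload does
  Stm := TFileStream.Create(cScratchFile, fmOpenReadWrite);
  try
    FillChar(Footer, SizeOf(Footer), 0);
    Footer.WaterMark := cWaterMarkGUID;
    Footer.ExeSize := LongInt(Stm.Size);
    Footer.DataSize := Length(cPayload);
    Stm.Position := Footer.ExeSize;
    Stm.WriteBuffer(cPayload[1], Length(cPayload));
    Stm.WriteBuffer(Footer, SizeOf(Footer));
  finally
    Stm.Free;
  end;

  // 3. Read the footer, extract the payload, then remove both
  Stm := TFileStream.Create(cScratchFile, fmOpenReadWrite);
  try
    Stm.Position := Stm.Size - SizeOf(Footer);
    Stm.ReadBuffer(Footer, SizeOf(Footer));
    if not IsEqualGUID(Footer.WaterMark, cWaterMarkGUID) then
      raise Exception.Create('Watermark not found');
    // ExeSize locates the payload and DataSize gives its length
    SetLength(Data, Footer.DataSize);
    Stm.Position := Footer.ExeSize;
    Stm.ReadBuffer(Data[1], Footer.DataSize);
    WriteLn('Payload read back: ', Data);
    // Truncating at ExeSize drops payload and footer, as RemovePayload does
    Stm.Size := Footer.ExeSize;
    WriteLn('File size after removal: ', Stm.Size);
  finally
    Stm.Free;
  end;
  DeleteFile(cScratchFile);
end.
```

The size reported after removal should equal the length of the stand-in executable data, showing that truncating at ExeSize restores the original file. The sketch does no error recovery beyond closing the stream, so it is a demonstration of the layout, not a substitute for the class.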

Now we have created the TPayload class it is easy to manipulate payload data. Unfortunately we must read and write the whole of the payload at once, which is not always convenient. An improvement would be to enable random access to the data. That's what we do next.

Random payload access

The "Delphi way" of providing random access to data is to derive a class from TStream and to override its abstract methods – and this is what we will do here. Our new class will be called TPayloadStream. It will detect payload data and provide read / write random access to it.

Not only does this approach provide random access but it also has the added advantage of hiding the details of how the payload is implemented from the user of the class. All the user sees is the familiar TStream interface while all the gory details are hidden in TPayloadStream's implementation.

Listing 9 shows the definition of the new class, along with an enumeration – TPayloadOpenMode – that is used to determine whether a payload stream object is to read or write the payload data. Note that in addition to overriding TStream's abstract methods, TPayloadStream also overrides the virtual SetSize method to enable the user to change the size of the payload. This is necessary because, by default, SetSize does nothing.

  type
    // exception raised by TPayloadStream (Exception is declared in SysUtils)
    EPayloadStream = class(Exception);

    TPayloadOpenMode = (
      pomRead, // read mode
      pomWrite // write (append) mode
    );

    TPayloadStream = class(TStream)
    private
      fMode: TPayloadOpenMode; // stream open mode
      fOldFileMode: Integer; // preserves old file mode
      fFile: File; // handle to exec file
      fDataStart: Integer; // start of payload data in file
      fDataSize: Integer; // size of payload
    public
      // opens payload of file in given open mode
      constructor Create(const FileName: string;
        const Mode: TPayloadOpenMode);
      // close file, updating data in write mode
      destructor Destroy; override;
      // moves to specified position in payload
      function Seek(Offset: LongInt; Origin: Word): LongInt; override;
      // sets size of payload in write mode only
      procedure SetSize(NewSize: LongInt); override;
      // Reads count bytes from payload
      function Read(var Buffer; Count: LongInt): LongInt; override;
      // Writes count bytes to payload in write mode only
      function Write(const Buffer; Count: LongInt): LongInt; override;
    end;
Listing 9

The public methods of TPayloadStream are:

  • Create – Creates a TPayloadStream and opens the named file either in read or write mode.
  • Destroy – Updates the payload footer, closes the file and destroys the stream object.
  • Seek – Moves the stream's pointer to the specified position in the payload data, ensuring that the pointer remains within the payload.
  • SetSize – Sets the size of the payload in write mode only. Raises an exception when used in read mode. Note that setting the size to zero will remove the payload and the associated footer record.
  • Read – Attempts to read a specified number of bytes into a buffer. If there is insufficient data in the payload then just the remaining bytes are read.
  • Write – Writes a specified number of bytes from a buffer to the payload, extending the payload if required. Works only in write mode – an exception is raised in read mode.

The class also uses the following private fields:

  • fMode – Records whether the stream is open for reading or writing.
  • fOldFileMode – Preserves the current Pascal file mode.
  • fFile – Pascal file descriptor that records the details of an open file.
  • fDataStart – Offset of the start of payload data from the start of the executable file.
  • fDataSize – Size of payload data.

We begin our review of the class's implementation by examining Listing 10, which shows the constructor and destructor. Once again we are using classic Pascal un-typed files to perform the underlying physical access to the executable file. However this could easily be changed to use some other file access techniques.

  constructor TPayloadStream.Create(const FileName: string;
    const Mode: TPayloadOpenMode);
  var
    Footer: TPayloadFooter; // footer record for payload data
  begin
    inherited Create;
    // Open file, saving current mode
    fMode := Mode;
    fOldFileMode := FileMode;
    AssignFile(fFile, FileName);
    case fMode of
      pomRead: FileMode := 0;
      pomWrite: FileMode := 2;
    end;
    Reset(fFile, 1);
    // Check for existing payload
    if ReadFooter(fFile, Footer) then
    begin
      // We have payload: record start and size of data
      fDataStart := Footer.ExeSize;
      fDataSize := Footer.DataSize;
    end
    else
    begin
      // There is no existing payload: start is end of file
      fDataStart := FileSize(fFile);
      fDataSize := 0;
    end;
    // Set required file position per mode
    case fMode of
      pomRead: System.Seek(fFile, fDataStart);
      pomWrite: System.Seek(fFile, fDataStart + fDataSize);
    end;
  end;

  destructor TPayloadStream.Destroy;
  var
    Footer: TPayloadFooter; // payload footer record
  begin
    if fMode = pomWrite then
    begin
      // We're in write mode: we need to update footer
      if fDataSize > 0 then
      begin
        // We have payload, so need a footer record
        InitFooter(Footer);
        Footer.ExeSize := fDataStart;
        Footer.DataSize := fDataSize;
        System.Seek(fFile, fDataStart + fDataSize);
        Truncate(fFile);
        BlockWrite(fFile, Footer, SizeOf(Footer));
      end
      else
      begin
        // No payload => no footer
        System.Seek(fFile, fDataStart);
        Truncate(fFile);
      end;
    end;
    // Close file and restore old file mode
    CloseFile(fFile);
    FileMode := fOldFileMode;
    inherited;
  end;
Listing 10

In the constructor the first thing we do is to record the open mode and then open the underlying file in the required mode. Next we try to read a payload footer record, using the ReadFooter function we developed in Listing 3.

If we have found a footer we get the start of the payload data (fDataStart) and the size of the payload (fDataSize) from the footer's ExeSize and DataSize fields respectively. If there is no footer record then we have no payload, so we set fDataStart to refer to just beyond the end of the file and set fDataSize to zero.

Setting fDataStart

fDataStart is the same as the size of the executable file because payloads always start immediately after the executable code.

Finally the constructor sets the file pointer according to the file mode – in read mode we set it to the start of the payload while in write mode we set it to the end.

In the destructor we proceed differently according to whether we are in read mode or write mode:

  • In read mode all there is to do is to close the file and restore the previous Pascal file mode.
  • In write mode we first check if we actually have a payload (fDataSize > 0). If so, we create a footer record using the InitFooter routine we defined in Listing 2 and record the payload size and start position in the record. We then seek to the end of the new payload data, truncate any data that falls beyond its end (as will be the case when the data size has shrunk), then write the footer. If there is no payload data we truncate the file at the end of the executable code. Finally we close the file and restore the file mode.

That completes the discussion of the class constructor and destructor. Let us now consider how we override the abstract Seek, Read and Write methods. Listing 11 has the details:

  function TPayloadStream.Seek(Offset: LongInt;
    Origin: Word): LongInt;
  begin
    // Calculate position in payload after move
    // (this is result value)
    case Origin of
      soFromBeginning:
        // Moving from start of stream: ignore -ve offsets
        if Offset >= 0 then
          Result := Offset
        else
          Result := 0;
      soFromEnd:
        // Moving from end of stream: ignore +ve offsets
        if Offset <= 0 then
          Result := fDataSize + Offset
        else
          Result := fDataSize;
      else // soFromCurrent and other values
        // Moving from current position
        Result := FilePos(fFile) - fDataStart + Offset;
    end;
    // Result must be within payload: make sure it is
    if Result < 0 then
      Result := 0;
    if Result > fDataSize then
      Result := fDataSize;
    // Perform actual seek in underlying file
    System.Seek(fFile, fDataStart + Result);
  end;

  function TPayloadStream.Read(var Buffer;
    Count: LongInt): LongInt;
  var
    BytesRead: Integer; // number of bytes read
    AvailBytes: Integer; // number of bytes left in stream
  begin
    // Work out how many bytes we can read
    AvailBytes := fDataSize - Position;
    if AvailBytes < Count then
      Count := AvailBytes;
    // Read data from file and return bytes read
    BlockRead(fFile, Buffer, Count, BytesRead);
    Result := BytesRead;
  end;

  function TPayloadStream.Write(const Buffer;
    Count: LongInt): LongInt;
  var
    BytesWritten: Integer; // number of bytes written
    Pos: Integer; // position in stream
  begin
    // Check in write mode
    if fMode <> pomWrite then
      raise EPayloadStream.Create(
        'TPayloadStream can''t write in read mode.');
    // Write the data, recording bytes written
    BlockWrite(fFile, Buffer, Count, BytesWritten);
    Result := BytesWritten;
    // Check if stream has grown
    Pos := FilePos(fFile);
    if Pos - fDataStart > fDataSize then
      fDataSize := Pos - fDataStart;
  end;
Listing 11

Seek is the most complicated of the three methods. This is because the FilePos and Seek routines (that we use to get and set the file pointer) operate on the whole file, while our stream positions must be relative to the start of the payload data. We must also ensure that the file pointer cannot be set outside the payload data. The case statement contains the code that calculates the required offset within the payload, depending on the seek origin. The two lines following the case statement constrain the offset within the payload data. Finally we perform the actual seek operation on the underlying file, offset from the start of the payload data. The method returns the new offset relative to the payload.

The Read method must ensure that the read falls wholly within the payload data. We can't assume that all the remaining bytes in the stream can be read, because the payload may be followed by a footer record that is not part of the data. Therefore we calculate the number of available bytes by subtracting the current position in the payload from the size of the payload data. If there is insufficient data to meet the request, the number of bytes to be read is reduced to the number of available bytes.

Note that Read uses TStream's Position property to get the current position in the payload data. This property calls the Seek method which, as we have seen, ensures that the position returned falls within the payload data.

Write is quite simple – we just check we are in write mode and output the data to the underlying file at the current position if so. The number of bytes written is returned. The only complication is that we must check if the write operation took us beyond the end of the current data and record the new data size if so. Should the stream be in read mode then Write raises an exception.

All that remains to do now is to override the SetSize method. Listing 12 provides the implementation.

  procedure TPayloadStream.SetSize(NewSize: LongInt);
  var
    Pos: Integer; // current position in stream
  begin
    // Check for write mode
    if fMode <> pomWrite then
      raise EPayloadStream.Create(
        'TPayloadStream can''t change size in read mode.');
    // Update size, adjusting position if required
    if NewSize < fDataSize then
    begin
      Pos := Position;
      fDataSize := NewSize;
      if Pos > fDataSize then
        Position := fDataSize;
    end;
  end;
Listing 12

Obviously, we can't change the stream size in read mode, so we raise an exception in this case. In write mode we only record the new size if it is less than the current payload size. In this case we must also check if the current stream position falls beyond the end of the reduced payload and move the position to the end of the truncated data if so. The Position property is used to get and set the stream position. As noted earlier, this property calls our overridden Seek method.

Why can't SetSize increase the payload size?

Prohibiting SetSize from extending the payload data is a design decision I took, because enlarging the data leaves the problem of having to write padding bytes to the payload. What should those bytes be? Zeros? Random data? I think it's reasonable to assume that the payload should only be extended by explicitly appending data to it using the Write method.

If your view is different, then I leave implementation to you as an exercise!

This completes our presentation of the TPayloadStream class.

Demo code

A demo program to accompany this article can be found in the delphidabbler/article-demos Git repository on GitHub.

You can view the code in the article-07 sub-directory. Alternatively download a zip file containing all the demos by going to the repository's landing page and clicking the Clone or download button and selecting Download ZIP.

See the demo's README.md file for details.

The demo does not currently contain the source code for, or an example of using, TPayloadStream.

This source code is merely a proof of concept and is intended only to illustrate this article. It is not designed for use in its current form in finished applications. The code is provided on an "AS IS" basis, WITHOUT WARRANTY OF ANY KIND, either express or implied.

The demo is open source. See the demo's LICENSE.md file for licensing details.

Feedback

I hope you found this article useful.

If you have any observations, comments, or have found any errors there are two places you can report them.

  1. For anything to do with the article content, but not the downloadable demo code, please use this website's Issues page on GitHub. Make sure you mention that the issue relates to "article #7".
  2. For bugs in the demo code see the article-demo project's README.md file for details of how to report them.