Working with Different File Formats: Buyer's Guide

It is a common scenario: “I’d like the application to save the report as an Excel spreadsheet”, says the client. Another frequent request is for Adobe PDF output, often used for emailing invoices and other documents thanks to its reliable rendering across a broad range of platforms. Web applications may need to generate charts in PNG format, or deliver a dynamically assembled download as a ZIP file.

You may also need to handle files as input rather than output. Someone, or another application, might create an Excel spreadsheet, for example, from which you have to gather structured data to insert into a database. All these tasks require you to understand and work with file formats.

File formats can be both highly complex and poorly documented. If you have ever sat down to write a file format parser or generator from scratch, you will know how difficult this can be. Typically, file formats exist in multiple versions, and some, such as the old binary Microsoft Office formats, went without public documentation for many years, leaving developers to puzzle them out by reverse engineering. Even formats that are documented, like Rich Text Format, have quirks that you can only discover the hard way, when your output does not work as expected.

Another issue is that what the user perceives as a single document may in fact combine multiple formats; a word-processing document with embedded images is one example. The AVI (Audio Video Interleave) video format is really a specification for a multimedia container, which may hold streams encoded in a variety of media formats. By contrast, there are open formats like PNG (Portable Network Graphics) which are well documented and relatively easy to handle.
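To give a sense of what "relatively easy to handle" means for PNG, here is a minimal Python sketch that writes and then reads the chunk structure described in the PNG specification: an 8-byte signature followed by chunks, each made up of a length, a four-byte type, the data and a CRC-32. The function names (make_chunk, read_chunks, build_tiny_png) are invented for this illustration, and the sketch handles only the container layout, not pixel decoding.

```python
import struct
import zlib

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build one chunk: big-endian length, type, data, CRC-32 of type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def read_chunks(png_bytes: bytes):
    """Yield (type, data) per chunk, checking the signature and each CRC."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    offset = 8
    while offset < len(png_bytes):
        (length,) = struct.unpack_from(">I", png_bytes, offset)
        chunk_type = png_bytes[offset + 4 : offset + 8]
        data = png_bytes[offset + 8 : offset + 8 + length]
        (crc,) = struct.unpack_from(">I", png_bytes, offset + 8 + length)
        if zlib.crc32(chunk_type + data) & 0xFFFFFFFF != crc:
            raise ValueError("CRC mismatch in chunk %r" % chunk_type)
        yield chunk_type, data
        offset += 12 + length  # 4 length + 4 type + data + 4 CRC


def build_tiny_png() -> bytes:
    """Assemble a 1x1, 8-bit grayscale PNG (hypothetical test image)."""
    # IHDR: width, height, bit depth, colour type, compression, filter, interlace
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    # One scanline: filter-type byte 0 followed by a single black pixel.
    idat = zlib.compress(b"\x00\x00")
    return (
        PNG_SIGNATURE
        + make_chunk(b"IHDR", ihdr)
        + make_chunk(b"IDAT", idat)
        + make_chunk(b"IEND", b"")
    )


if __name__ == "__main__":
    for chunk_type, data in read_chunks(build_tiny_png()):
        print(chunk_type.decode("ascii"), len(data))
```

Even this simple case needs care with byte order and checksums, and real images add filtering, interlacing, palettes and ancillary chunks on top.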

Fortunately it is rarely necessary to work from scratch. This is one of those cases where developers are better off using an existing library or component.
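As an illustration of how much work a library saves, the dynamically assembled ZIP download mentioned at the start takes only a few lines with the zipfile module in Python's standard library. This is a minimal sketch, not a production implementation: the function name build_zip and the file names in the usage example are hypothetical.

```python
import io
import zipfile
from typing import Mapping


def build_zip(files: Mapping[str, bytes]) -> bytes:
    """Pack the given name -> content pairs into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    # The archive's central directory is written when the with-block closes,
    # so the bytes are only complete at this point.
    return buffer.getvalue()


if __name__ == "__main__":
    payload = build_zip({
        "report.txt": b"Quarterly summary\n",
        "data/values.csv": b"a,b\n1,2\n",
    })
    print(len(payload), "bytes ready to send as a download")
```

The library takes care of the local file headers, compression, checksums and central directory, which you would otherwise have to write and test by hand.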
