Overview

What Is It For?

XTabulator is a lightweight, flexible tool intended to help you manipulate, modify, mix, match, and otherwise edit comma- and tab-separated files, which are the usual way of "dumping" data from an application or database for sharing with other applications or databases. The common approach to editing such files is to use a spreadsheet application, which "supports" reading and writing tabular files as an afterthought but is otherwise poorly adapted to the task of manipulating them.

What Isn't It For?

XTabulator is not a spreadsheet application. It does not perform calculations. XTabulator deals in rows of columnar text: it won't add, multiply, or average anything.

What Is Tabular Data?

Tabular data files are a popular, generic way of passing any type of data in table (tabular) form for any other application to read. This very basic format has been in use for decades and can be thought of as a primitive form of XML. Many applications today allow users to export data to, or import data from, common tabular data file formats such as comma-separated values (CSV) and tab-separated values (TAB).

CUSTID  LASTNAME  FIRSTNAME  STREET            CITY     COUNTRY
001     Smith     John       4112 Some Street  Anytown  USA
002     Doe       Jane       8372 Some Street  Anytown  USA
003     Doe       John       8372 Some Street  Anytown  USA

An example data table

You can think of tabular data as records in a database. Records are separated by one specific character while each record's fields are separated by another. Using the comma-separated values format (where fields are separated by the comma character and records are separated by a hard return), the data might appear as follows:

CUSTID,LASTNAME,FIRSTNAME,STREET,CITY,COUNTRY
001,Smith,John,4112 Some Street,Anytown,USA
002,Doe,Jane,8372 Some Street,Anytown,USA
003,Doe,John,8372 Some Street,Anytown,USA
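
To make the record/field structure concrete, here is a minimal sketch that reads the example above with Python's standard csv module. It is an illustration only, not a description of how XTabulator works internally; the names CSV_TEXT and parse_records are made up for this example.

```python
import csv
import io

# The sample data from the text: commas separate fields,
# line breaks separate records.
CSV_TEXT = (
    "CUSTID,LASTNAME,FIRSTNAME,STREET,CITY,COUNTRY\n"
    "001,Smith,John,4112 Some Street,Anytown,USA\n"
    "002,Doe,Jane,8372 Some Street,Anytown,USA\n"
    "003,Doe,John,8372 Some Street,Anytown,USA\n"
)


def parse_records(text, delimiter=","):
    """Return a list of records, each one a list of field strings."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


records = parse_records(CSV_TEXT)
for record in records:
    print(record)
```

Every field comes back as text, so a value such as 001 keeps its leading zeros; nothing is interpreted as a number.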

Since the structure of this data cannot be determined from the file alone, any application that opens such a file needs to be told how the tabular data is represented. A human can look at a tabular data file and usually make sense of it (we see that there are three people listed, complete with their addresses and a customer number, and we can infer this with a quick scan of our eyes), but a computer needs a bit of guidance. For example, there is no way for the computer to 'know' whether the first record contains the headers or is actually the first data record in the file.
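
The header question can be sketched in a few lines of Python. In this hypothetical helper (load_table and the COLUMN1, COLUMN2, ... placeholder names are assumptions for illustration, not XTabulator behavior), the caller has to state whether the first record is a header, because the same text yields a different table either way.

```python
import csv
import io


def load_table(text, first_row_is_header):
    """Split parsed rows into (column_names, data_rows).

    The caller must say whether the first record holds the headers;
    the file itself does not carry that information.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if first_row_is_header and rows:
        return rows[0], rows[1:]
    # No header row: invent placeholder names and keep every row as data.
    width = len(rows[0]) if rows else 0
    return [f"COLUMN{i + 1}" for i in range(width)], rows


SAMPLE = "CUSTID,LASTNAME,FIRSTNAME\n001,Smith,John\n002,Doe,Jane\n"

names, data = load_table(SAMPLE, first_row_is_header=True)
print(names, len(data))

names, data = load_table(SAMPLE, first_row_is_header=False)
print(names, len(data))
```

With the flag set, the table has two data records; without it, the header line is counted as a third record.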

To further complicate things, any character can be used as the separator character for both records and fields. Using the pipe (|) character as the field separator (called the "delimiter"), the same data from the example above would look like this:

CUSTID|LASTNAME|FIRSTNAME|STREET|CITY|COUNTRY
001|Smith|John|4112 Some Street|Anytown|USA
002|Doe|Jane|8372 Some Street|Anytown|USA
003|Doe|John|8372 Some Street|Anytown|USA
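
As a rough illustration that only the delimiter differs between the two examples, the following Python sketch rewrites comma-separated text with a pipe delimiter and back again. The function name change_delimiter is hypothetical, and the sketch uses Python's standard csv module rather than anything specific to XTabulator.

```python
import csv
import io


def change_delimiter(text, old=",", new="|"):
    """Re-emit delimited text using a different field delimiter."""
    source = csv.reader(io.StringIO(text), delimiter=old)
    target = io.StringIO()
    # lineterminator="\n" keeps one plain line break per record.
    writer = csv.writer(target, delimiter=new, lineterminator="\n")
    writer.writerows(source)
    return target.getvalue()


comma_text = "CUSTID,LASTNAME,FIRSTNAME\n001,Smith,John\n"
pipe_text = change_delimiter(comma_text)
print(pipe_text)

# Converting back reproduces the original text for this simple input.
print(change_delimiter(pipe_text, old="|", new=",") == comma_text)
```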

The advantages of tabular data files are portability and human-readability. The disadvantage, however, is the relative ease with which you can introduce errors. For example, in the case of the comma-separated values format, what happens if one of the values has a comma in it (such as a "NAME" field which contains the last name, a comma, and the first name)? The field's contents could be treated as two separate fields, and all subsequent fields in that record would be shifted to the right. Such fields are expected to be "quoted" in order to be interpreted correctly (see Quoted Fields), but it is up to the user to verify the data once it is displayed. For more information on how XTabulator manages inconsistent data, see How Errors Are Handled.
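
The embedded-comma problem is easy to reproduce. The sketch below uses Python's standard csv module and a made-up record with a NAME field; the quoting convention shown (double quotes around the field) is the one Python's csv module uses by default and is assumed here for illustration, so refer to Quoted Fields for what XTabulator itself expects.

```python
import csv
import io

# A record whose NAME field is "Smith, John" (last name, comma, first name).
unquoted = "001,Smith, John,Anytown"    # NAME is not quoted
quoted = '001,"Smith, John",Anytown'    # NAME is wrapped in double quotes

# Without quotes the comma inside NAME is taken as a field separator,
# so the record gains a field and CITY shifts one place to the right.
shifted = next(csv.reader(io.StringIO(unquoted)))
print(len(shifted), shifted)

# With quotes the comma is kept as part of the NAME field.
intact = next(csv.reader(io.StringIO(quoted)))
print(len(intact), intact)
```

The unquoted record parses into four fields and the quoted one into three, which is why the result should still be checked by eye after loading.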