Tags covered in this guide:
<file>
<byte>, <unsigned-byte>, <short>, <unsigned-short>, <int>, <unsigned-int>, <long>, <unsigned-long>, <float>, <double>, <packed-decimal>, <zoned-decimal>, <fixed-byte>, <fixed-short>, <fixed-int>, <fixed-long>
<string>
<flags-byte>, <flags-short>, <flags-int>, <flags-long>
<group>
<block>
<array>
<data>
<ascii-int>
<label>
<color>
<time>, <date>
<tab-group>
1. Introduction
The file format definitions that allow FileCarver to edit or create files
of various types are written in the Extensible Markup Language (XML). XML files
are simply text files with the .xml extension that follow the XML format.
Although this guide attempts to explain in detail, with many examples, how XML
file format definitions work, it helps to already be at least a little familiar
with the general syntax and structure of XML documents. A lot of information on
the XML standard, for beginners and experts alike, is available at http://www.xml.com/.
In addition to this guide, an XML Schema file is included with FileCarver (named 'xml_schema.xsd') that specifies the structure of file format definitions. FileCarver will automatically use this schema to validate file format definitions on startup, and will report any errors it encounters. It is also possible to use an external XML editor that can validate file format definitions against the provided XML Schema as you edit them. One example of such a program is EditiX XML editor.
2. It All Begins With The File Tag
Each file format definition file should begin with a <file> tag.
This tag acts as a container for the different fields that are specified
in the file. It can have the optional attribute name, which is used to
customize how the file type is displayed in the list by FileCarver. When
all elements of the file have been placed after the opening tag, the file
tag must then be closed. The order in which the field tags are placed inside
the file tag is the order in which FileCarver reads and writes those fields
from/to the binary file.
<file name="PC Info File"> <int name="Some Value"/> <string name="Some Other Value" length="20"/> </file>
With the optional attribute format on the <file> tag, you may specify the
default byte order of numeric fields in the file. Possible values are
'big-endian' and 'little-endian'. If the format attribute is not present on
the file tag, the numeric fields will use the 'big-endian' byte order by default.
Note: You may always override the byte order on a field-by-field basis.
<file name="PC Info File" format="little-endian"> <int name="Field1" format="big-endian"/> <long name="Field2"/> </file>
3. Numeric Field Types
The following numeric types are available for use in FileCarver:
Field | Bytes | Type | Min Value | Max Value |
---|---|---|---|---|
byte | 1 | integer | -128 | 127 |
unsigned-byte | 1 | integer | 0 | 255 |
short | 2 | integer | -32768 | 32767 |
unsigned-short | 2 | integer | 0 | 65535 |
int | 4 | integer | -2147483648 | 2147483647 |
unsigned-int | 4 | integer | 0 | 4294967295 |
long | 8 | integer | -9223372036854775808 | 9223372036854775807 |
unsigned-long | 8 | integer | 0 | 18446744073709551615 |
float | 4 | floating point | Negative Real | Positive Real |
double | 8 | floating point | Negative Real | Positive Real |
fixed-byte | 1 | fixed point | Negative Real | Positive Real |
fixed-short | 2 | fixed point | Negative Real | Positive Real |
fixed-int | 4 | fixed point | Negative Real | Positive Real |
fixed-long | 8 | fixed point | Negative Real | Positive Real |
packed-decimal | variable | decimal | Unbounded | Unbounded |
zoned-decimal | variable | decimal | Unbounded | Unbounded |
Numeric fields support the following attributes:
Attribute | Description | Example |
---|---|---|
name | The name of the field, as will be labeled in the user interface. Can consist of any characters allowed in an XML attribute value. | <int name="Strength"/> |
id | The id of the field, which can be used to reference this field from other fields. Can consist of a number of lower-case characters, digits and underscores. Must start with a lower-case character. | <byte id="c_age23"/> |
value | The value that an instance of this field will contain when a new file is created. The default value, when not specified, is 0 for numeric types. The specified value must be within the range for the field type. For integer field types (see table), the value may be specified in hexadecimal by prepending the 0x prefix, for example value="0xFF" (the 0x prefix will be replaced by as many 0's as necessary to match the number of bytes of the field). | <float value="2.3"/> |
editable | Specifies whether this field is editable. If unspecified, the default is always true. You may specify false to make a field read-only. | <short editable="false"/> |
hidden | Specifies whether this field will be hidden from viewing and editing. If unspecified, the default value is always false. You may specify true to make the field hidden (the value in that field will still be preserved when opening and saving the file). | <double hidden="true"/> |
format | Specifies the endianness of the field. Not applicable to packed-decimal and zoned-decimal fields, nor to byte fields. There are two possible values for this attribute for numeric types: big-endian and little-endian. The default corresponds to the value that was set for the format attribute on the file tag, or big-endian if no such attribute has been set. If your file format stores values in little-endian order (for example, the same way that they would be stored in RAM on an Intel x86 CPU), then you should specify the little-endian format. | <int format="little-endian"/> |
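The following attributes restrict the range of values that a numeric field accepts in the user interface:
Attribute | Description | Example |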
---|---|---|
min | When specified, limits the minimum value of the field (as can be set in the user interface), to the specified value. This value must be in the range of valid values for this field, and must be less than or equal to the max value, if specified. | <int min="-273"/> |
max | When specified, limits the maximum value of the field (as can be set in the user interface), to the specified value. This value must be in the range of valid values for this field, and must be greater than or equal to the min value, if specified. | <byte max="100"/> |
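As a minimal sketch of how the attributes above can be combined on numeric fields (the file name, field names and values are made up for illustration and do not describe a real file format):

```xml
<file name="Sensor Log" format="little-endian">
  <!-- Signed 4-byte integer; the user interface only accepts -273 to 1000 -->
  <int name="Temperature" id="temperature" value="20" min="-273" max="1000"/>
  <!-- Read-only 2-byte marker: hexadecimal default, byte order overridden for this field -->
  <unsigned-short name="Marker" value="0xCAFE" editable="false" format="big-endian"/>
  <!-- Not shown to the user, but preserved when the file is opened and saved -->
  <byte name="Reserved" hidden="true"/>
</file>
```

Because the file tag sets format="little-endian", only the Marker field is stored in big-endian order here.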
The zoned-decimal and packed-decimal field types are commonly used in
mainframes and in COBOL applications. These fields can consist of multiple
bytes and can have an implicit decimal point positioned at a specific
location. The zoned-decimal type generally uses one byte per digit, while the
packed-decimal type stores two digits in a single byte. The following
additional attributes are available for zoned-decimal and packed-decimal fields:
Attribute | Description | Example |
---|---|---|
length | Required. Specifies the length of the field in bytes. | <packed-decimal length="3"/> |
decimals | Specifies the number of decimal digits that this field has. This effectively places a decimal point at that position, when the field's value is displayed in the user interface. The default value is 0. | <zoned-decimal length="5" decimals="2"/> |
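The two decimal types might be used together as in the sketch below. The record layout and field names are hypothetical; only the length and decimals attributes come from the table above.

```xml
<file name="Invoice Record">
  <!-- 3 bytes of packed decimal, displayed with 2 digits after the decimal point -->
  <packed-decimal name="Unit Price" length="3" decimals="2"/>
  <!-- 5 bytes of zoned decimal (generally one digit per byte), no decimal places -->
  <zoned-decimal name="Quantity" length="5"/>
</file>
```

Since decimals defaults to 0, the Quantity field is displayed as a whole number.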
A note about the 'long' type: The 'long' type is an 8-byte integer, which
corresponds to C's 'long a;' only when the C code has been compiled in 64-bit
mode for 64-bit processors (and even then not on every platform: on 64-bit
Windows, 'long' remains 4 bytes). On 32-bit machines, 'long a;' is actually a
4-byte integer, identical to 'int a;'. An 8-byte integer may be declared in C on
32-bit machines using 'long long a;'. This corresponds to FileCarver's 'long'
type, which is also identical to Java's 'long' type. FileCarver's
'unsigned-long' type has no primitive equivalent in Java. General information
about integer types can be found on Wikipedia.
4. The String Field Type
The <string> tag is used to denote a field in the file that will hold a string
of characters. This field type supports a number of attributes to specify
the exact format of the string.
Attribute | Description | Example |
---|---|---|
name | The name of the field, as will be labeled in the user interface. Can consist of any characters allowed in an XML attribute value. | <string name="First Name" ... /> |
id | The id of the field, which can be used to reference this field from other fields. Can consist of a number of lower-case characters, digits and underscores. Must start with a lower-case character. | <string id="f_name" ... /> |
editable | Specifies whether this field is editable. If unspecified, the default is always true. You may specify false to make a field read-only. | <string editable="false" ... /> |
hidden | Specifies whether this field will be hidden from viewing and editing. If unspecified, the default value is always false. You may specify true to make the field hidden (the value in that field will still be preserved when opening and saving the file). | <string hidden="true" ... /> |
In addition to the standard attributes listed above, the string field supports a number of special attributes that are described below:
Attribute | Description |
---|---|
value | Specifies the value that this field will contain when a new file is created. This attribute supports escape sequences for special characters, such as \n for newline and \t for tab. In addition, since these values are specified in an XML file, you can use XML entities to produce characters that cannot be typed directly inside an attribute value, such as the &quot; entity for the double quote. |
length | For string fields, the length attribute can be set to a literal number, as in length="20". When you specify 'variable' as the value of the length attribute, the end of the string is determined by the format attribute (see 'terminated' and 'length-specified' in the rows below). Finally, and only if this is the last field in the file or enclosing block, you may set the value of the length attribute to 'remainder'. |
format | The format of the field. The values referred to in this guide are 'terminated', 'padded', 'left-padded', 'right-padded' and 'length-specified'. |
terminator | Can be used when the format attribute is set to 'terminated'. Specifies the character that marks the end of the string. The default value is '\0', which is equivalent to the integer value 0. The value may be any single character or an escape sequence for a special value. |
padchar | Should be used when the format attribute is set to 'padded', 'left-padded' or 'right-padded'. Specifies the character used for padding the string. If this attribute is not present, the string is padded with the space character. The value may be any single character or an escape sequence for a special value. |
lengthfield | Should be used when the length attribute is set to 'variable' and the format attribute is set to 'length-specified'. In these circumstances, this attribute must be set to the same value that is specified in the id attribute of the field that stores the length of the string. That field must come before the string field whose length it specifies. Field ids are scoped: the first field with the matching id in the parent container of the string field (file, group, or array element) is used. Such a field must exist. NOTE: Currently, FileCarver does not support more than one field relying on the same length field. |
encoding | Specifies the character encoding that the string uses. |
display | The display attribute for a string field can have two values: 'text-field' and 'text-area'. 'text-field' is the default, so setting it is equivalent to omitting the attribute. When set to 'text-field', the string is displayed on a single line in the user interface, whereas 'text-area' makes it span multiple lines, which is desirable for longer strings. |
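The string attributes described in the table above might be combined as in the following sketch. All field names and ids are hypothetical, and it is an assumption that a fixed-length string accepts the 'right-padded' format exactly as written here.

```xml
<group name="Strings">
  <!-- Stores the length of the Name string; it must come before the string that uses it -->
  <unsigned-byte name="Name Length" id="name_len"/>
  <string name="Name" length="variable" format="length-specified" lengthfield="name_len"/>
  <!-- Variable-length string ended by the default terminator ('\0') -->
  <string name="Comment" length="variable" format="terminated" display="text-area"/>
  <!-- Fixed 8-byte string, padded with '*' instead of the default space -->
  <string name="Code" length="8" format="right-padded" padchar="*"/>
</group>
```

Only the Name string refers to name_len, in keeping with the note that a length field cannot be shared by more than one field.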
5. Bit Flag Field Types
FileCarver supports variants of the byte, short, int and long fields that are used
specifically for storing a set of boolean values, i.e. flags. Within the graphical
user interface, such fields are displayed as sets of checkboxes (each flag
being a separate checkbox). These fields support the same attributes as their
normal counterparts, in addition to being able to contain <bit> tags.
Type | Bytes | Bits | Min Bit Pos | Max Bit Pos |
---|---|---|---|---|
flag-byte | 1 | 8 | 0 | 7 |
flag-short | 2 | 16 | 0 | 15 |
flag-int | 4 | 32 | 0 | 31 |
flag-long | 8 | 64 | 0 | 63 |
To define each flag within such a field, the <bit> tag is used. This tag supports the following attributes:
Attribute | Description | Example |
---|---|---|
name | The name of this flag bit, which will be the name displayed next to the checkbox for this flag in the user interface. This attribute is required. | <bit name="Can Fly" ... /> |
position | The position of the bit within the field. A value between 0 and the maximum bit position supported by the field type (see previous table) is required. No two bits within a single field should specify the same position. This attribute is required. | <bit position="2" ... /> |
value | Specifies the initial value of the bit, either 1 (set) or 0 (not set) when a new file is created. The default value (when not specified) is 0. | <bit value="1" ... /> |
editable | Specifies whether this flag bit is editable. If unspecified, the default is always true. You may specify false to make the flag read-only. | <bit editable="false" ... /> |
hidden | Specifies whether this flag bit will be hidden from viewing and editing. If unspecified, the default value is always false. You may specify true to make the bit flag hidden (the value in that field will still be preserved when opening and saving the file). | <bit hidden="true" ... /> |
<flag-byte name="Characteristics"> <bit position="0" name="Is Evil" value="1"/> <bit position="1" name="Can Fly" value="0"/> <bit position="2" name="Bipedal" hidden="true"/> <bit position="3" name="Humanoid" editable="false"/> </flag-byte>
6. Field Groups
Fields can be grouped together with the <group> tag. This works by simply
placing the tag as a container around any number of field tags, or other
group or array (see later) tags.
<group> <int name="Some Value"/> <string name="Some Other Value" length="20"/> </group>
Grouping tags together has several advantages. First, by using the optional
name attribute on a group tag, you can put a titled border around a set of elements.
<group name="Values"> <int name="Some Value"/> <string name="Some Other Value" length="20"/> </group>
The optional display attribute on a group tag currently has five choices for
values: 'normal', 'horizontal', 'collapsable', 'collapsed' and 'window'.
Setting the display attribute to 'normal' is equivalent to not specifying the
attribute at all (in other words, it is the default value). The 'normal' display
type lays out the fields one after the other, just as fields are normally laid
out in the file tag.
The 'horizontal' value on the display attribute causes the group to lay out its
fields horizontally, one after the other, rather than vertically as would be
done otherwise.
The 'collapsable' and 'collapsed' values display the field group in a way that
lets the user click on a triangle to hide and show the fields in the group (the
difference between the two values is the initial state).
Finally, display="window" displays the fields of a group in a separate window,
with a button on the main window to open it.
<group name="Values" display="window"> <int name="Some Value"/> <string name="Some Other Value" length="20"/> </group>
Another advantage of the group tag is that the id attributes of fields
contained by the group are in the scope of that group, which allows flexibility
when referencing fields by their ids from other tags. Group tags can also be
nested easily.
<group name="Values" display="window"> <int name="Some Value"/> <string name="Some Other Value"/> <group name="More Values"> <int name="A Third Value"/> <string name="Something Else" length="20"/> </group> </group>
The <group> tag also supports the attributes editable and hidden in the same way that other fields do.
7. Field Blocks
Field blocks are like field groups, but have an additional length attribute
that specifies the exact number of bytes a field block takes up in the file. You
may refer to the previous section on field groups for information that is common
to both field groups and field blocks.
The length of the block can be specified in two ways, both using the length
attribute. The first way is to specify a literal positive integer as the length
of the block field.
<block length="128"> <string name="First Value" length="variable" format="terminated"/> <string name="Second Value" length="variable" format="terminated"/> </block>
In the above example, the block will always take up 128 bytes in the file. Within the block are two null-terminated strings of variable length. If the combined length of the two strings is less than 128 bytes, then the remaining length will be skipped when reading the file, and will be filled with 0's when writing to the file. If the combined length of the two strings is greater than 128 bytes when writing, the contents of the block field will be truncated to 128 bytes.
The second way to specify the length of the block field is with a JavaScript
expression that is evaluated when reading the block field and when writing the
block field to determine its length. This is done by placing the expression
inside the parentheses of eval(), and setting this as the value of the length
attribute.
<int name="Twice the Block Length" id="twicelen"/> <block length="eval(twicelen/2)"> <string name="First Value" length="variable" format="terminated"/> <string name="Second Value" length="variable" format="terminated"/> </block>
In the above example, the JavaScript expression twicelen/2 is evaluated to
determine the length of the block prior to reading the block as well as
prior to writing the block. Please note that the fields referenced by the
JavaScript expression to specify the length of the block do NOT get
automatically updated as the contents of the block change, and must instead be
updated manually (if desired), using onupdate events on the contents of the
block whose length may change.
Finally, any fields that support setting the length attribute to the value
remainder (strings, arrays, data fields, etc.), when placed as the last element
of a block, are read until the end of the block, but not further. This allows
for greater flexibility in limiting the length of such fields.
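A minimal sketch of a 'remainder' field inside a block, using hypothetical field names and assuming that a data field needs no attributes other than those shown:

```xml
<file name="Record File">
  <block length="64">
    <int name="Record Id"/>
    <!-- Last element of the block: reads up to the end of the 64-byte block, but not further -->
    <data name="Payload" length="remainder"/>
  </block>
</file>
```

With a 4-byte Record Id at the start of the block, the Payload field would cover the remaining 60 bytes.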
8. Field Arrays
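The <array> tag is similar to the <group> tag, in the sense that it acts as a
container around other field tags, group tags and array tags. The difference is
that an array tag allows you to have more than one set of the fields it
contains. Three basic types of arrays are supported: arrays of fixed length,
arrays of variable length, and arrays that are parallel to another array. The
display of a parallel array can be merged into the other array by setting
display="merge" on an array that is parallel to another array.
The <array> tag supports the following attributes:
Attribute | Description |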
---|---|
name | The name of the array, if specified, will be used to put a titled border around it. |
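length | The length attribute specifies the number of elements in the array. If the number of elements in the array can vary, you can set the length attribute to 'variable' and determine where the array ends with the lengthfield or termwhen attribute described below. Finally, and only if this array is the last tag in the file or enclosing block, you can set the value of the length attribute to 'remainder'. |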
lengthfield | Must be specified when the length attribute is set to 'variable'. This attribute must be set to the same value that is specified in the id attribute of the field that stores the length of the array. That field must come before the array field whose length it specifies. Field ids are scoped, which means the first field with the matching id in the parent container of this array field (file, group, or array element) is used. Such a field must exist. NOTE: Currently, FileCarver does not support more than one field relying on the same length field. |
parallelto | When this attribute is specified, the 'length' attribute must be omitted, and the value of this attribute must match the value that is specified in the id attribute of the array field to which this array will be parallel. A parallel array may (but does not have to) have its display attribute set to 'merge'. |
termwhen | Specifies a JavaScript expression that is evaluated, in the scope of each element of the array, to determine when the array is terminated. When the expression evaluates to true, the array is considered terminated. The expression must evaluate to true on the default values (as specified in the XML definition) of an element in this array, as these values are used to terminate the array when writing to a file. Two types of terminated array are possible. Terminated arrays of variable length must have their 'length' attribute set to 'variable'. When read from file, each element of such an array is read and the 'termwhen' expression is evaluated. When the expression evaluates to true, no further elements of the array are read from the file, and the next field after the array is then read. Terminated arrays of fixed length must have their 'length' attribute specify a positive integer as the maximum number of elements they are to contain. When read from file, that many elements (as specified by the length) are always read, but the number of active elements is determined by the 'termwhen' expression. |
rowname | Specifies the name of each row in the array, as displayed in the user interface. If unspecified, the default value is 'Element', producing a user interface that labels its rows 'Element 0', 'Element 1', 'Element 2', etc. If this attribute is present, its value is used as the basis for labeling the rows of the array. You may also place a JavaScript expression inside eval() and set that as the rowname attribute. The JavaScript is executed in the scope of the specific row of the array, so you may refer to specific fields simply by their IDs. You may also find out the index of the current array element by using this.element_index, which makes it possible to number array elements from 1 instead of 0 in the list. |
display | This attribute may be omitted. If it is omitted, or if set to 'normal', then the array is displayed normally. However, and only if this is a parallel array with the parallelto attribute present and the length attribute omitted, you may set the display attribute to 'merge'. When this is done, instead of this array getting a dedicated user interface control, the fields of each element of this array are appended to the corresponding element in the master array, and are displayed in the same user interface as the element of the master array. |
onupdate | This attribute, when specified, allows a JavaScript expression to be executed when the number of elements in the array changes. For example, this could be used to update a calculated field that displays the number of elements in the array. For more information, refer to the sections in this guide that cover onupdate events and calculated fields. |
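The array attributes above could be put together roughly as follows. This is a sketch under stated assumptions: every name and id is hypothetical, the rowname expression assumes that the eval() result is used directly as the row label, and the termwhen expression assumes that a field's value is read through .value inside the element scope, as in the onupdate examples elsewhere in this guide.

```xml
<file name="Inventory">
  <!-- Variable-length array whose element count is stored in a preceding field -->
  <unsigned-short name="Item Count" id="item_count"/>
  <array name="Items" id="items" length="variable" lengthfield="item_count"
         rowname="eval('Item ' + (this.element_index + 1))">
    <int name="Item Id"/>
    <string name="Item Name" length="20"/>
  </array>
  <!-- Parallel array: no length attribute, fields shown inside the rows of "items" -->
  <array name="Prices" parallelto="items" display="merge">
    <double name="Price"/>
  </array>
  <!-- Terminated array: reading stops at the first element whose Kind is 0 -->
  <array name="Chunks" length="variable" termwhen="kind.value == 0">
    <unsigned-byte name="Kind" id="kind" value="0"/>
  </array>
</file>
```

The Kind field defaults to 0 so that the termwhen expression is true on the default values of an element, which is what the termwhen description requires for writing the terminator.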
9. Data Field Type
FileCarver supports a <data> field type tag, which specifies that a portion of
the file is data of fixed or variable length. The editor for this field type in
the graphical user interface is a hex editor. This tag can be used for editing
data in complex formats that FileCarver does not yet support, or (when set to
be hidden) just to skip portions of the file that should not be edited.
The following attributes are supported by the <data> tag:
Attribute | Description |
---|---|
name | The name of the field, used to label it in the user interface. |
id | The id of the field, which can be used to reference this field from other fields. Can consist of a number of lower-case characters, digits and underscores. Must start with a lower-case character. |
editable | Specifies whether this field is editable. If unspecified, the default is always true. You may specify false to make a field read-only. |
hidden | Specifies whether this field will be hidden from viewing and editing. If unspecified, the default value is always false. You may specify true to make the field hidden (the value in that field will still be preserved when opening and saving the file). |
value | Specifies the starting value of this data field when a new file is first created. If unspecified, every byte in the data will be zero. Otherwise, you may enter a hexadecimal value that will be set when a new file is made. The value is limited to the characters 0-9, a-f and A-F, with an optional 0x prefix. If the prefix 0x is specified, zeroes are inserted before the hex value to satisfy the specified length; otherwise zeroes are inserted after the hex value. If the field is of variable length and the specified value has an odd number of characters, a single zero is appended to make the number of digits even, and therefore byte divisible. |
length | The length attribute specifies the size of the data field in bytes. If the size of the data can vary, you can set the length attribute to 'variable' and name the field that stores the size with the lengthfield attribute described below. Finally, and only if this data field is the last tag in the file or enclosing block, you can set the value of the length attribute to 'remainder'. |
lengthfield | Must be specified when the length attribute is set to 'variable'. This attribute must be set to the same value that is specified in the id attribute of the field that stores the length of the data field. That field must come before the data field whose length it specifies. Field ids are scoped, which means the first field with the matching id in the parent container of this data field (file, group, or array element) is used. Such a field must exist. NOTE: Currently, FileCarver does not support more than one field relying on the same length field. |
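A short sketch of the <data> tag in both its variable-length and fixed-length forms. The field names are hypothetical, and it is assumed that a fixed length is given as a literal byte count.

```xml
<file name="Container">
  <!-- The size of the payload is stored in a field that precedes it -->
  <unsigned-int name="Payload Size" id="payload_size"/>
  <data name="Payload" length="variable" lengthfield="payload_size"/>
  <!-- Fixed 4-byte region, hidden from the user; with the 0x prefix the zeroes go before the value -->
  <data name="Reserved" length="4" value="0xFF" hidden="true"/>
</file>
```

Following the description of the value attribute, the Reserved field of a new file would start out as the bytes 00 00 00 FF, whereas value="FF" without the prefix would give FF 00 00 00.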
10. ASCII Int Field Type
FileCarver supports a special variation of the 4-byte <int> field: the
<ascii-int> field. This works in the same way as the <int> field, except for
how the data is displayed and edited. Unlike the <int> field, which treats the
4 bytes as a binary number that is then edited in decimal, the <ascii-int>
field treats those 4 bytes as four ASCII characters, and edits them as text. All
tag attributes are the same as for the <int> field, except for the value
attribute, which must be in the form of 4 characters, if specified. As an
exception, entering 'NONE' as the value will actually produce the integer value
-1, and not the characters 'N', 'O', 'N', 'E'.
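For illustration, two <ascii-int> fields might be declared as below. The field names and the 'RIFF' value are hypothetical examples rather than part of any format defined in this guide.

```xml
<group name="Header">
  <!-- Edited as the four ASCII characters R, I, F, F -->
  <ascii-int name="Chunk Type" value="RIFF"/>
  <!-- Special case: 'NONE' is stored as the integer -1, not as the characters N, O, N, E -->
  <ascii-int name="Owner Tag" value="NONE"/>
</group>
```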
11. Label Field Type
The <label> tag can be used to insert a descriptive string of text into the
graphical user interface that is presented to the user. This field is ignored
when reading or writing from/to the file, and serves merely to present
information to the user. The message displayed by the label is specified by
setting the 'name' attribute.
<label name="Please fill in the following information."/>
12. Calculated Fields
FileCarver supports calculated fields - these are fields that are neither read from the file, nor written to the file. Instead, the values of these fields are calculated at run-time from the values of other fields.
To specify that a field is calculated, simply set the format attribute on the
field tag to the value 'calculated'. You may also wish to set the editable
attribute to 'false' on the field.
The value of a calculated field should be set by another field's onupdate
event. In the example below, a calculated field displays the total of the
values from the two fields before it:
<group> <double name="Material Cost" id="material" onupdate="this.parent.total.value = this.value + this.parent.labor.value" /> <double name="Labor Cost" id="labor" onupdate="this.parent.total.value = this.value + this.parent.material.value" /> <double name="Total Cost" format="calculated" editable="false" id="total"/> </group>
13. Color Field Type
The <color> tag represents a calculated field that can be used to select a
color, using a color picker control. Since colors may be stored in a multitude
of different formats, the color field is a calculated field that is not
actually stored in the file, but gets its value from other fields.
The color field thus expects its value to be set by one or more onupdate
events on other fields, while the onupdate event on the color field should be
used to set the appropriate values back to the fields storing the color
information.
The color field exposes three attributes that can be retrieved and set with
JavaScript: red, green and blue. Each of these attributes is a floating point
value between 0.0 and 1.0, corresponding to the respective RGB component of
this color.
<group> <unsigned-byte name="Red" id="cred" hidden="true" onupdate="this.parent.col.red = this.value/255.0" /> <unsigned-byte name="Green" id="cgreen" hidden="true" onupdate="this.parent.col.green = this.value/255.0" /> <unsigned-byte name="Blue" id="cblue" hidden="true" onupdate="this.parent.col.blue = this.value/255.0" /> <color name="Color" id="col" onupdate=" cred.value = this.red*255; cgreen.value = this.green*255; cblue.value = this.blue*255; "/> </group>
14. Time and Date Field Types
The <time> and <date> tags represent calculated fields that can be used to
display and manipulate times and dates. Since times and dates may be stored in
a multitude of different formats, these fields are calculated and are not
actually stored in the file, but rather receive their values from other fields.
These fields thus expect their values to be set by one or more onupdate events
on other fields, while the onupdate event on the <time> and <date> fields
should be used to set the appropriate values back to the fields storing the
time or date information.
The <time> field exposes three attributes that can be retrieved and set with JavaScript:
JavaScript Property | Description |
---|---|
hour | The number of hours. Values are integers ranging from 0 to 23. |
minute | The number of minutes. Values are integers ranging from 0 to 59. |
second | The number of seconds. Values are integers ranging from 0 to 59. |
The <date> field exposes three attributes that can be retrieved and set with JavaScript:
JavaScript Property | Description |
---|---|
year | The year as an integer (ex: 2007). |
month | The month as an integer value from 1 to 12 (January to December). |
day | The day of the month, ranging from 1 to 31. |
In the example below, the <time> field is used to display the time stored in a
packed 2-byte DOS Time value (as is used in ZIP files, for example). Because the
JavaScript sits inside an XML attribute, the & and < characters are written as
&amp; and &lt;:
<unsigned-short name="Last mod file time" id="dt" hidden="true" onupdate=" this.parent.mtime.second = (this.value &amp; 0x1F) * 2; this.parent.mtime.minute = (this.value &amp; 0x7E0) / 0x20; this.parent.mtime.hour = (this.value &amp; 0xF800) / 0x800; "/> <time name="File Last Modified on (Time)" id="mtime" onupdate=" this.parent.dt.value = ((this.hour)&lt;&lt;11) + ((this.minute)&lt;&lt;5) + (this.second/2); "/>
In the example below, the <date> field is used to display the date stored in a
packed 2-byte DOS Date value (as is used in ZIP files, for example). Because the
JavaScript sits inside an XML attribute, the & and < characters are written as
&amp; and &lt;:
<unsigned-short name="Last mod file date" id="dd" hidden="true" onupdate=" this.parent.mdate.day = (this.value &amp; 0x1F); this.parent.mdate.month = (this.value &amp; 0x1E0) / 0x20; this.parent.mdate.year = 1980 + (this.value &amp; 0xFE00) / 0x200; "/> <date name="File Last Modified on (Date)" id="mdate" onupdate=" this.parent.dd.value = ((this.year - 1980)&lt;&lt;9) + ((this.month)&lt;&lt;5) + this.day; "/>
15. Tabbed Field Groups
The <tab-group> tag allows a grouping of fields that is presented in a tabbed
interface. Unlike regular <group> tags, the <tab-group> may only contain inner
tags of type <group>. The name of each such tag is then used as the title of
the tab for that group.
<tab-group> <group name="Tab 1"> ... </group> <group name="Tab 2"> ... </group> <group name="Tab 3"> ... </group> </tab-group>
16. Displaying Fields as Lists
For all numeric field types, the ascii-int field type, and the string field type,
FileCarver supports a display mode in which the user selects the value for that
field from a popup list. This is done by setting the display attribute on the
tag element to the value 'list', and then nesting special <option> tags inside
the tag that is to be displayed in this way. Each option tag takes a required
value attribute, which supports the same range of values as the value attribute
on the parent field tag. The description of the option to be displayed to the
user should be placed between the starting and ending tags of the option.
<int name="Occupation" display="list"> <option value="0">Farmer</option> <option value="1">Carpenter</option> <option value="2">Blacksmith</option> </int>
This will present the user with a drop-down list containing the three choices: Farmer, Carpenter, or Blacksmith. When the user selects one choice, the number specified in the value attribute of the option selected will be set for that field.
If you want the user to be able to enter any custom value that was not necessarily
included in the definition, you can set the display
attribute to 'combo'.
<int name="Occupation" display="combo"> <option value="0">Farmer</option> <option value="1">Carpenter</option> <option value="2">Blacksmith</option> </int>
This will allow the user to either select one of the three occupations, or enter a value other than 0, 1 or 2 directly into the textbox.
In addition to the required value
attribute, option tags also support
the following other attributes.
Attribute | Description | Example |
---|---|---|
selectable | Specifies whether this option is selectable. If set to false, it will be greyed out in the user interface. When not specified, the value defaults to true. | <option value="1" selectable="false">Red</option> |
hidden | Specifies whether this option is hidden. If set to true, it will not be displayed at all as one of the choices in the user interface. When not specified, the value defaults to false. | <option value="0" hidden="true">Humanoid</option> |
For fields that have options, it is possible to retrieve the labels of an option via
JavaScript based on the option value. This can be done by indexing into the associative
options
array on the field.
<int name="Occupation" display="list" onupdate="this.parent.occtxt.value=this.options[this.value]"> <option value="0">Farmer</option> <option value="1">Carpenter</option> <option value="2">Blacksmith</option> </int> <string name="Occupation Text" id="occtxt" length="variable" format="calculated"/>
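The lookup itself is ordinary JavaScript indexing. As a rough sketch, the options array can be pictured as an object that maps option values to labels; the object below is a stand-in written for this example, and how FileCarver behaves for a value with no matching option (as can happen with display="combo") is an assumption here, modelled on plain JavaScript returning undefined.

```javascript
// Stand-in for the options array of the Occupation field (assumed shape).
const options = { 0: "Farmer", 1: "Carpenter", 2: "Blacksmith" };

// labelFor is an illustrative helper, not a FileCarver function.
function labelFor(value) {
  const label = options[value];
  // A value typed into a combo box may have no matching option.
  return label === undefined ? "(unlisted value " + value + ")" : label;
}

console.log(labelFor(1)); // Carpenter
console.log(labelFor(7)); // (unlisted value 7)
```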
Additionally, it is also possible to display integer fields with an up/down spinner control. The
following fields can be displayed in this manner: byte
,
unsigned-byte
, short
, unsigned-short
,
int
, unsigned-int
, long
and
unsigned-long
.
This will allow the user to increment and decrement the value of the field with a set of up and down arrows.
<int name="Age" display="spinner" value="20" />
17. OnUpdate Events for Fields |
FileCarver supports user-configurable onupdate
events for fields.
These are JavaScript code snippets that are specified on the onupdate
attribute for a field, that perform a certain action when the value of the field
is updated.
The JavaScript code may perform normal JavaScript calculations, as per the
regular JavaScript rules, and may also update the values of other fields. A good
use of JavaScript onupdate
events is to set values for calculated
fields.
<flag-byte name="Info" id="info">
    <bit position="0" name="Is Old" value="0"/>
    <bit position="1" name="Has Children" value="0"/>
</flag-byte>
<int name="Age" id="age" value="20" onupdate="
    if (this.value &gt; 80) { info.bits[0] = 1; } else { info.bits[0] = 0; }
"/>
For more information and examples of onupdate
events, see
the section on calculated fields.
18. Conditional Existence of Fields |
There are times when the existence of a field in a binary file depends on a previous field having some specific value. For example, you may have a flag field which, when a certain bit is set, indicates that there is an extra field following it.
FileCarver offers full support for such formats, with the existsif
attribute which is available on all top level fields. Top level fields
are all tags that can be put directly under a <file>
tag.
The value of the existsif
attribute is treated as a JavaScript expression (more below)
that is executed in the scope of the parent field to evaluate to either true or false. When
it evaluates to true, the field exists; when it does not, the field does not exist. When a
field does not exist, it is neither read from the file, nor written to the file.
<flag-byte name="Info" id="info"> <bit position="0" name="Has Values" value="1"/> <bit position="1" name="Has Description" value="0"/> </flag-byte> <group existsif="info.bits[0] != 0" name="Values"> <unsigned-int name="Salary" id="salary"/> <string name="Occupation" length="32" existsif=" ( salary != 0 ) " /> </group> <string name="Description" display="text-area" length="256" existsif=" ( info.bits[1] == 1 ) " />
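To make the nesting in this example concrete, the following plain JavaScript sketch lists which fields would exist for a given set of values. It is only a model of the rules described above, written for this guide; existingFields is a hypothetical helper, and the nested condition on Occupation is only reached when its parent group exists.

```javascript
// existingFields is a hypothetical helper that mirrors the existsif
// conditions of the example definition; it is not a FileCarver function.
function existingFields(info, salary) {
  const fields = ["Info"];
  if (info.bits[0] != 0) {
    // The group exists, so its children are considered.
    fields.push("Values", "Salary");
    if (salary != 0) fields.push("Occupation");
  }
  if (info.bits[1] == 1) fields.push("Description");
  return fields;
}

console.log(existingFields({ bits: [1, 0] }, 0).join(", "));    // Info, Values, Salary
console.log(existingFields({ bits: [1, 1] }, 5000).join(", ")); // Info, Values, Salary, Occupation, Description
console.log(existingFields({ bits: [0, 1] }, 5000).join(", ")); // Info, Description
```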
19. JavaScript Expressions |
FileCarver uses an embedded JavaScript interpreter to process various expressions
that may be specified in file format definitions. FileCarver uses JavaScript to
evaluate the existence of fields using the existsif
attribute, to execute
update events on fields that have the onupdate
attribute set, as well
as to determine the last element of a terminated array by evaluating the JavaScript
in the termwhen
attribute.
It is also possible to embed a global set of re-usable functions in a file format definition
by specifying them in a script
tag at the beginning of the definition.
<file> <script> function convert(value) { return (value % 20 + 15); } </script> . . . </file>
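Functions placed in the script tag are ordinary JavaScript, so they can be checked in any JavaScript environment before being pasted into a definition. The snippet below runs the convert function from the example on a few sample inputs; the inputs are chosen only for illustration. Note that the JavaScript remainder operator keeps the sign of its left operand, which matters if the function is applied to signed fields.

```javascript
// Same function body as in the definition above.
function convert(value) {
  return (value % 20 + 15);
}

console.log(convert(7));  // 22
console.log(convert(45)); // 20
console.log(convert(-3)); // 12, because -3 % 20 is -3 in JavaScript
```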
There are numerous resources available on the web that explain the rules of JavaScript expressions, and therefore they are not covered in this guide. Only the specific details of their implementation in FileCarver are explained.
Accessing Field Data
To access the data inside a field, and use it in a JavaScript expression,
that field must have an id
attribute specified, which will act as a
variable name in JavaScript. For most types of fields, when you refer to the field
by its id
in a JavaScript expression, the value of that field is used.
<unsigned-int name="Salary" id="salary"/> <string name="Occupation" length="32" existsif=" ( salary != 0 ) " />
Field ids are scoped by their containing tag, such as <group>
or
<file>
. For example, if you have two tags with the same id, one in
the same group as the expression, and the other one defined in a parent scope/group,
the one in the same group takes priority.
<unsigned-int name="Outer Salary" id="salary"/> <group name="Values"> <unsigned-int name="Inner Salary" id="salary"/> <string name="Occupation" length="32" existsif=" ( salary != 0 ) " /> </group>
In the above example, the value of the "Inner Salary" field is used in the evaluation of the expression.
Additionally, fields may only refer to other fields that have been declared before them in the file definition. This is a necessary mechanism, as FileCarver needs to be able to determine if a field exists before reading it from the file, which must be done using only previously-defined fields.
If a field has been previously defined, but is located in a scope that is nested deeper than the scope of the expression, then any references to it must be done through that field's parent group.
<group name="Values" id="values"> <unsigned-int name="Salary" id="salary"/> </group> <unsigned-int name="Job Security" existsif=" ( values.salary != 0 ) " />
References to individual bits of a flag-byte
, flag-short
,
flag-int
, or flag-long
field may be done by referencing the
bits
member array of that field. The index of the array corresponds
to the position
attribute on the bit
tag. The value of
a bit is either 1, if it is set, or 0 otherwise.
<flag-byte name="Info" id="info"> <bit position="3" name="Has Salary" value="1"/> </flag-byte> <unsigned-int name="Salary" existsif="info.bits[3] == 1"/>
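As an illustration of how a stored byte relates to the bits array, the sketch below expands a numeric value into an array of 0 and 1 entries. It assumes that position 0 refers to the least significant bit, which this guide does not state explicitly; toBits is a hypothetical helper and not part of FileCarver.

```javascript
// Assumption: position 0 is the least significant bit of the stored value.
// toBits is an illustrative helper, not a FileCarver function.
function toBits(value, width) {
  const bits = [];
  for (let position = 0; position < width; position++) {
    bits.push((value >> position) & 1);
  }
  return bits;
}

// A flag byte holding 0x08 has only the bit at position 3 set.
const infoBits = toBits(0x08, 8);
console.log(infoBits.join("")); // 00010000 (index 0 first)
console.log(infoBits[3] == 1);  // true
```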
For array tags, you may query their current length (number of active elements), using the length
property.
<array name="Pencils" length="variable" termwhen=" ( pname == '' ) " id="pencils"> <string length="16" id="pname" name="Pencil Name"/> </array> <string name="Who stole my pencils?" length="variable" existsif="pencils.length == 0"/>
Note: In the above example, it is valid to reference pname
in the
termwhen
condition of the array, as that condition is only evaluated after
an element in the array is read from the file (the array is considered terminated when
an empty Pencil Name has been read).
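The read-then-test order can be modelled in a few lines of plain JavaScript. This is only a sketch of the behaviour described in the note: readPencils is a hypothetical function, the input list stands in for the strings stored in the file, and the sketch assumes that the terminating element is not counted in the array length.

```javascript
// Assumption: the terminating element is not counted in the array length.
// readPencils is an illustrative model, not a FileCarver function.
function readPencils(names) {
  const pencils = [];
  for (const pname of names) {
    // The element has been read at this point; only now is termwhen tested.
    if (pname == "") break;
    pencils.push({ pname: pname });
  }
  return pencils;
}

console.log(readPencils(["HB", "2B", "", "ignored"]).length); // 2
console.log(readPencils([""]).length);                        // 0
```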
Array elements may also be referenced by index, in order to access their children. Each array element can be considered as an implicit field group.
<array name="Pencils" length="variable" termwhen=" ( pname == '' ) " id="pencils"> <string length="16" id="pname" name="Pencil Name"/> </array> <group existsif="pencils.length == 1"> <string name="Why is the first pencil broken?" length="variable" existsif=" pencils[0].pname == 'broken' "/> </group>
Since expressions are evaluated using a standards-compliant JavaScript interpreter, they may use any features allowed in JavaScript expressions, such as boolean operators, parentheses, etc. However, since these expressions are stored in XML, it is necessary to follow XML syntax rules. Specifically the following substitutions must be made:
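< must be written as &lt;
> must be written as &gt;
& must be written as &amp;
" must be written as &quot; (though this is unnecessary, as it is equivalent to a single quote, which is supported directly)
So, conditions like "a < b" must be written as "a &lt; b" and
"(a == 2) && (a == 3)" as "(a == 2) &amp;&amp; (a == 3)".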
This syntax is unfortunately necessary, in order to maintain the integrity of the XML format used for
file format definitions. However, you can avoid having to substitute the above characters if your script
resides in a global script
tag. If so, you can either enclose your script body between
<![CDATA[
and ]]>
or use the xi:include
tag to import it as text,
as described in the following section of the guide.
<file> <script><![CDATA[ function isValid(value) { return (value >= 0 && value < 16); } ]]></script> . . . </file>
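For snippets that must stay inside an attribute such as onupdate or existsif, the substitutions listed earlier in this section can be applied mechanically. The helper below is a small sketch in plain JavaScript; escapeForXmlAttribute is a hypothetical name and is not provided by FileCarver. The ampersand must be replaced first, otherwise the ampersands introduced by the other replacements would be escaped a second time.

```javascript
// escapeForXmlAttribute is an illustrative helper, not a FileCarver function.
function escapeForXmlAttribute(script) {
  return script
    .replace(/&/g, "&amp;")  // must come first
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

console.log(escapeForXmlAttribute("(a == 2) && (b < 3)"));
// (a == 2) &amp;&amp; (b &lt; 3)
```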
Some additional JavaScript functions are provided by FileCarver. The functions provided are:
alert()
and prompt()
which behave like functions of the same name in
JavaScript used on web pages.
The alert()
function takes one string parameter, and displays a dialog box with
the specified message and an "OK" button.
<unsigned-int name="Min Height" id="min_h"/>
<unsigned-int name="Max Height" id="max_h" onupdate="
    if (this.value &lt; min_h.value) {
        alert('Max Height cannot be less than Min Height.');
        this.value = min_h.value;
    }
"/>
The prompt()
function allows you to display an input prompt to the user. The
first parameter is a string which specifies the message that will be presented to the user. A
second parameter, which is optional, specifies the initial value of the input field. The
prompt()
function will return the string that the user entered, or null if the user
clicked Cancel.
<unsigned-int name="Min Height" id="min_h"/>
<unsigned-int name="Max Height" id="max_h" onupdate="
    while (this.value &lt; min_h.value) {
        var value = prompt('Max Height cannot be less than Min Height. ' +
                           'Please enter new value:', min_h.value);
        if (value == null) {
            this.value = min_h.value;
            break;
        }
        this.value = value;
    }
"/>
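Since prompt() returns a string (or null), it can be useful to work out the validation logic separately before embedding it in an onupdate attribute. The sketch below does this in plain JavaScript; askForMaxHeight and scriptedPrompt are hypothetical names, and the prompt function is passed in as a parameter so that scripted replies can stand in for the dialog box when the code is run outside FileCarver.

```javascript
// promptFn stands in for FileCarver's prompt(); it is passed in so that the
// sketch can run anywhere. Both function names here are illustrative.
function askForMaxHeight(minHeight, promptFn) {
  for (;;) {
    const answer = promptFn("Max Height cannot be less than Min Height. " +
                            "Please enter new value:", String(minHeight));
    if (answer === null) return minHeight; // user clicked Cancel
    const parsed = parseInt(answer, 10);
    // Ask again unless the reply is a number that is large enough.
    if (!isNaN(parsed) && parsed >= minHeight) return parsed;
  }
}

// Scripted replies replace the dialog box for this demonstration.
function scriptedPrompt(replies) {
  let next = 0;
  return function () {
    return next < replies.length ? replies[next++] : null;
  };
}

console.log(askForMaxHeight(10, scriptedPrompt(["abc", "5", "12"]))); // 12
console.log(askForMaxHeight(10, scriptedPrompt([])));                 // 10
```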
20. Including XML from Other Files |
FileCarver supports the ability to modularize a file format definition into multiple separate
files, and include the XML from those other files in the main XML definition. This is done by
using the xi:include
tag with an href
attribute. The href
attribute specifies the path of the file to include; the path may either be relative to
the location of the file containing the xi:include
tag or it may be an absolute path.
For example, given the following two files:
definitions/bookshelf.xml:
<file name="Bookshelf" xmlns:xi="http://www.w3.org/2001/XInclude">
    <array length="remainder" rowname="Book">
        <xi:include href="includes/book.xml"/>
    </array>
</file>
definitions/includes/book.xml:
<group>
    <string name="Title" length="variable" format="terminated"/>
    <string name="Author" length="variable" format="terminated"/>
    <unsigned-short name="Number of Pages" display="spinner"/>
</group>
FileCarver will then treat definitions/bookshelf.xml
as if it was:
<file name="Bookshelf"> <array length="remainder" rowname="Book"> <group> <string name="Title" length="variable" format="terminated"/> <string name="Author" length="variable" format="terminated"/> <unsigned-short name="Number of Pages" display="spinner"/> </group> </array> </file>
There are several important things to notice in the above example. First, it is
necessary to specify xmlns:xi="http://www.w3.org/2001/XInclude"
as
an attribute on the file
tag, if the definition will include sub-files.
Second, each sub-file must be a well-formed XML document, and thus everything must
be nested within a single top-level tag - in this case the group
tag. While FileCarver itself does not require the extra redundant group
tag in the definition, the include mechanism does.
It is also possible for included files to include other files, and there is no limit on how deep this can go. Only circular inclusion (file A includes file B which includes, directly or indirectly, file A) is not allowed.
Finally, it is also possible to specify the parse="text"
attribute
on the xi:include
tag to include a file as plain text. This is most
useful for including external JavaScript code, which can then be written without needing to
XML-escape characters such as >
and &
.
Example:
<file xmlns:xi="http://www.w3.org/2001/XInclude">
    <script>
        <xi:include parse="text" href="scripts/functions.js"/>
    </script>
    . . .
</file>
21. Re-usable Type Definitions |
In addition to including XML from other files, FileCarver allows you to define specific field
types at the top of the file format definition, and re-use these field types throughout
the rest of the file. This is done by placing a defined-types
tag
at the top of the format definition, containing one or more type
tags. Each
type
tag must then have a unique id
attribute, and would encapsulate the
actual type definition. Then, you can refer to these global types using a type-ref
element.
<file name="Bookshelf"> <defined-types> <type id="book"> <string name="Title" length="variable" format="terminated"/> <string name="Author" length="variable" format="terminated"/> <unsigned-short name="Number of Pages" display="spinner"/> </type> </defined-types> <array length="remainder" rowname="Book"> <type-ref id="book"/> </array> </file>
This will produce the same result as if the elements in the defined-type book were declared
inside the actual array. The advantage is that you may have many type-refs
fields that
refer to the same type, without duplicated code.
An important distinction between defined types and the XML include mechanism covered in the
previous section, is that type-refs
are replaced by their corresponding types
only when they exist. That is, they will not be expanded if their own
existsif
condition, or that of one of their ancestors, evaluates to false. This allows for implementing nested,
recursive format definitions - where a type essentially includes itself - but conditionally.
The defined-types
tag must be located after a global script
tag,
if both exist in a file format definition.
22. File Filters |
FileCarver supports user-defined file input and output filters. An input filter allows you to specify a transformation that is to be done on the file data as it is read, before it is processed by FileCarver. Meanwhile, an output filter specifies the transformation that is to be done on the data when it is saved from FileCarver, prior to it being written to disk.
Example uses of input and output filters may include performing compression or encryption on the file, or some other custom processing on the data, as required by the format.
Filters are implemented in Java, each filter being a Java class.
The input filter class must be a subclass of the java.io.InputStream
class,
and must have a constructor that takes an existing InputStream
as a parameter.
Extending the java.io.FilterInputStream
class is suggested. The fully qualified
class name of the input filter class should be set as the value of the inputfilter
attribute on the file
tag.
The output filter class must be a subclass of the java.io.OutputStream
class,
and must have a constructor that takes an existing OutputStream
as a parameter.
Extending the java.io.FilterOutputStream
class is suggested. The fully qualified
class name of the output filter class should be set as the value of the outputfilter
attribute on the file
tag.
Compiled filter classes may be placed in the folder filters
which should be
put in the same directory as FileCarver, in order to be available for use. Filter classes may also
come from the standard Java class library.
For example, you may use the built-in Java classes java.util.zip.GZIPInputStream
and java.util.zip.GZIPOutputStream
as input and output filters respectively, as they
conform to FileCarver's requirements for filter classes. This will allow loading and saving of
GZIP-compressed files.
<file inputfilter="java.util.zip.GZIPInputStream" outputfilter="java.util.zip.GZIPOutputStream"> ... </file>
It is not difficult to create a custom file filter as a Java class. For instance, if you wish to
create an input filter that changes all lowercase ASCII characters that are read to uppercase, you can
write a class like the following, extending java.io.FilterInputStream
and overriding
all the read
methods:
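import java.io.*;

// Input filter that converts lowercase ASCII letters to uppercase as they are read.
public class ToUpperInputFilter extends FilterInputStream {

    // FileCarver requires a constructor that takes the underlying InputStream.
    public ToUpperInputFilter(InputStream in) {
        super(in);
    }

    // Reads a single byte; the end-of-stream value -1 is returned unchanged.
    public int read() throws IOException {
        int c = in.read();
        return (c >= 'a' && c <= 'z' ? c + ('A' - 'a') : c);
    }

    // Reads up to len bytes, then converts only the bytes that were actually read.
    public int read(byte[] b, int off, int len) throws IOException {
        int n = super.read(b, off, len);
        // n is -1 at the end of the stream, in which case the loop does not run.
        for (int i = off; i < off + n; i++) {
            if (b[i] >= 'a' && b[i] <= 'z') {
                b[i] += ('A' - 'a');
            }
        }
        return n;
    }

    public int read(byte[] b) throws IOException {
        return read(b, 0, b.length);
    }
}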
Then, compile this code, which would be saved in ToUpperInputFilter.java
, to produce
the file ToUpperInputFilter.class
. Place the class file into the filters
folder in the same directory as FileCarver (you may need to create it). This will make the ToUpperInputFilter
available for use, as shown below.
<file inputfilter="ToUpperInputFilter"> ... </file>
Please note that the above example is simplified, and does not take into consideration the format of the file. It may be necessary, for some filters, to only process specific regions of the file, which can be done by keeping track of the current offset into the file.
23. Command Line Arguments |
Command line arguments may be used to modify how FileCarver behaves when it is launched.
To have FileCarver open one or more files when launched, simply specify the file paths of those files as command line arguments. Since FileCarver does not know which file format definition you wish to use to open those files, you will be presented with a prompt allowing you to select the file format definition to use for each file.
Alternatively, you may specify the path to a single file format definition which will
be used to open the data file(s) immediately. This is done by using the -d
command
line option, followed by the path to the definition file, before any data file paths.
For example, if the following command line arguments are given to FileCarver:
-d /path/to/some/file1.xml /path/to/some/other/file2.dat
Then, FileCarver will launch, loading only the
file1.xml
file format definition,
and will immediately open the data file file2.dat
using that file format definition.
More than one data file may be specified.
24. Environment Variables |
Two environment variables can be used to control the behaviour of FileCarver: FC_DEFINITIONS_DIR and FC_DEFAULT_FILE_DIR.
The FC_DEFINITIONS_DIR environment variable specifies the path to the directory that FileCarver will use to read file format definitions from. When FC_DEFINITIONS_DIR is unspecified, FileCarver will look in a directory named 'definitions' in the same directory where FileCarver is located.
The FC_DEFAULT_FILE_DIR environment variable specifies the default directory that will be shown in an Open File or Save File dialog box with FileCarver. When this environment variable is not set, the Open File and Save File dialog boxes will start in the directory where FileCarver is located.
Setting environment variables is specific to each operating system, and is thus not covered in this guide.
25. Frequently Asked Questions |
I have defined a file format for FileCarver that corresponds field-for-field to
a C struct that I am using in my program. Why is it, that when I open files that
are fine in my program with FileCarver, they do not display correctly?
There are two possibilities. First, check that the correct endianness is set. If
you are using an x86 computer (Intel, AMD, etc), then you want to set the format
attribute to 'little-endian' on the file
tag, which is the same as setting it
on every numeric field in the definition. This will ensure that fields are read with the
correct endianness.
If you are certain that you have set the endianness correctly on fields in the
file format definition, then you are most likely experiencing a problem with implicit
struct padding by your C compiler. This process involves the compiler adding hidden
variables into the struct during compilation, to make the size of the struct and the
offsets of its members convenient numbers to work with. A simple example is the C
compiler adding an extra hidden char to struct { short a; char b; }
to make it 4 bytes long, instead of 3.
With many compilers, you can set options that will make the compiler issue warnings
when this is happening. For example, if you are using GCC,
pass the flag -Wpadded
on the command line to the compiler to receive the
appropriate warnings. Once you have identified that some variables are being implicitly
added by the compiler, it is best to make them explicit in the C struct, and add them as
hidden fields to FileCarver's file definition.
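To get a feeling for where a compiler is likely to insert padding, the layout can be estimated with a short calculation. The sketch below, in plain JavaScript, assumes a common rule: each member is aligned to a multiple of its own size, and the total size is rounded up to a multiple of the largest member size. Real compilers and platforms may differ, so treat the result as a hint and confirm it with your compiler. The function name layout and the member descriptions are made up for this example.

```javascript
// Assumption: each member is aligned to its own size, and the struct size is
// rounded up to the largest member size. This is typical but compiler-specific.
// layout is an illustrative helper, not a FileCarver function.
function layout(members) {
  let offset = 0;
  let maxAlign = 1;
  const fields = [];
  for (const { name, size } of members) {
    const padding = (size - (offset % size)) % size;
    if (padding > 0) fields.push({ name: "(padding)", offset: offset, size: padding });
    offset += padding;
    fields.push({ name: name, offset: offset, size: size });
    offset += size;
    maxAlign = Math.max(maxAlign, size);
  }
  const tail = (maxAlign - (offset % maxAlign)) % maxAlign;
  if (tail > 0) fields.push({ name: "(padding)", offset: offset, size: tail });
  return { fields: fields, size: offset + tail };
}

// struct { short a; char b; } with a 2-byte short and a 1-byte char
const result = layout([{ name: "a", size: 2 }, { name: "b", size: 1 }]);
console.log(result.size); // 4
for (const f of result.fields) {
  console.log(f.offset + ": " + f.name + " (" + f.size + " bytes)");
}
```

Each "(padding)" entry in the output corresponds to a hidden field that would need to be added to the file format definition at that offset.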
I want to be able to add new fields later on to my file formats, and re-open my
old files and convert them to the new formats with extra fields. Can this be done with FileCarver?
Absolutely. Your format must have a field that specifies the current version of the file format. Then, you can use the 'existsif' attribute on the new fields that you add.
For example, suppose you have the following format:
<file>
    <unsigned-byte name="Format Version" id="version" value="1" editable="false"/>
    <int name="Field 1"/>
    <string name="Field 2" length="12"/>
</file>
Later, if you decide to add an extra field to the format, you can update the file format definition to the following:
<file>
    <unsigned-byte name="Format" id="version" display="list" value="2">
        <option value="1">Version 1</option>
        <option value="2">Version 2</option>
    </unsigned-byte>
    <int name="Field 1"/>
    <string name="Field 2" length="12"/>
    <byte name="Field 3" existsif=" ( version == 2 ) "/>
</file>
The new definition supports both version 1 and version 2 file formats. So, how would a version 1 file be converted to version 2? Very easily. You would open the version 1 file with the new format definition and select "Version 2" from the drop-down list for the Format field. "Field 3" will appear in the user interface, where you can now enter a value. When you save the file, it will be using the new version 2 format. You're done!
FileCarver reports a java.io.UnsupportedEncodingException when loading a file
format definition containing a string field that uses a specific character
encoding. However, this character encoding is listed as supported in this guide.
How can I resolve this problem?
You will need to re-install Java on your machine and enable "support for additional languages" during the installation process.
If you are using Windows, it is possible to activate this functionality without re-installing Java by going to "Add and Remove Programs" from the Control Panel, selecting the Java Runtime Environment, clicking "Change" and activating support for additional languages through the wizard that is brought up.