The internal representation for expressions is for the most part quite straightforward. However, there are a few facts that one must bear in mind. In particular, the expression “tree” is actually a directed acyclic graph. (For example, there may be many references to the integer constant zero throughout the source program; many of these will be represented by the same expression node.) You should not rely on certain kinds of nodes being shared, nor should you rely on certain kinds of nodes being unshared.
The following macros can be used with all expression nodes:
TREE_TYPE
Returns the type of the expression.
In what follows, some nodes that one might expect to always have type
bool are documented to have either integral or boolean type. At
some point in the future, the C front end may also make use of this same
intermediate representation, and at this point these nodes will
certainly have integral type. The previous sentence is not meant to
imply that the C++ front end does not or will not give these nodes
integral type.
Below, we list the various kinds of expression nodes. Except where
noted otherwise, the operands to an expression are accessed using the
TREE_OPERAND macro. For example, to access the first operand to
a binary plus expression expr, use:
TREE_OPERAND (expr, 0)
As this example indicates, the operands are zero-indexed.
All the expressions starting with OMP_ represent directives and
clauses used by the OpenMP API http://www.openmp.org/.
The table below begins with constants, moves on to unary expressions, then proceeds to binary expressions, and concludes with various other kinds of expressions:
INTEGER_CST
These nodes represent integer constants. The type of these constants is obtained with TREE_TYPE; they are not always of type
int. In particular, char constants are represented with
INTEGER_CST nodes. The value of the integer constant e is
given by
((TREE_INT_CST_HIGH (e) << HOST_BITS_PER_WIDE_INT)
+ TREE_INT_CST_LOW (e))
HOST_BITS_PER_WIDE_INT is at least thirty-two on all platforms. Both
TREE_INT_CST_HIGH and TREE_INT_CST_LOW return a
HOST_WIDE_INT. The value of an INTEGER_CST is interpreted
as a signed or unsigned quantity depending on the type of the constant.
In general, the expression given above will overflow, so it should not
be used to calculate the value of the constant.
The variable integer_zero_node is an integer constant with value
zero. Similarly, integer_one_node is an integer constant with
value one. The size_zero_node and size_one_node variables
are analogous, but have type size_t rather than int.
The function tree_int_cst_lt is a predicate which holds if its
first argument is less than its second. Both constants are assumed to
have the same signedness (i.e., either both should be signed or both
should be unsigned.) The full width of the constant is used when doing
the comparison; the usual rules about promotions and conversions are
ignored. Similarly, tree_int_cst_equal holds if the two
constants are equal. The tree_int_cst_sgn function returns the
sign of a constant. The value is 1, 0, or -1
according to whether the constant is greater than, equal to, or less
than zero. Again, the signedness of the constant's type is taken into
account; an unsigned constant is never less than zero, no matter what
its bit-pattern.
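The effect of signedness on these predicates can be illustrated with a small stand-alone model. This is only a sketch: `struct toy_cst`, `toy_cst_sgn` and `toy_cst_lt` are hypothetical names invented here, and a single 64-bit word stands in for the real two-word representation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an INTEGER_CST: a 64-bit pattern plus the
   signedness of its type.  This is not GCC's real data structure.  */
struct toy_cst {
  uint64_t bits;
  bool is_unsigned;
};

/* Analogue of tree_int_cst_sgn: returns 1, 0, or -1.  */
static int toy_cst_sgn(struct toy_cst c) {
  if (c.bits == 0)
    return 0;
  if (!c.is_unsigned && (c.bits >> 63))
    return -1;  /* top bit set in a signed type: negative */
  return 1;     /* an unsigned constant is never negative */
}

/* Analogue of tree_int_cst_lt; both constants must share signedness.  */
static bool toy_cst_lt(struct toy_cst a, struct toy_cst b) {
  assert(a.is_unsigned == b.is_unsigned);
  if (a.is_unsigned)
    return a.bits < b.bits;
  /* Flipping the sign bit makes unsigned order match signed order.  */
  return (a.bits ^ (1ULL << 63)) < (b.bits ^ (1ULL << 63));
}
```

With the all-ones bit pattern, the signed reading yields -1 from `toy_cst_sgn` and compares less than zero, while the unsigned reading yields 1 and does not.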
REAL_CST
COMPLEX_CST
These nodes are used to represent complex number constants, that is, a __complex__ whose parts are constant nodes. The
TREE_REALPART and TREE_IMAGPART macros return the real and the
imaginary parts respectively.
VECTOR_CST
These nodes represent vector constants. The first operand is a TREE_LIST of the
constant nodes and is accessed through TREE_VECTOR_CST_ELTS.
STRING_CST
These nodes represent string constants. The TREE_STRING_LENGTH
returns the length of the string, as an int. The
TREE_STRING_POINTER is a char* containing the string
itself. The string may not be NUL-terminated, and it may contain
embedded NUL characters. Therefore, the
TREE_STRING_LENGTH includes the trailing NUL if it is
present.
For wide string constants, the TREE_STRING_LENGTH is the number
of bytes in the string, and the TREE_STRING_POINTER
points to an array of the bytes of the string, as represented on the
target system (that is, as integers in the target endianness). Wide and
non-wide string constants are distinguished only by the TREE_TYPE
of the STRING_CST.
FIXME: The formats of string constants are not well-defined when the
target system bytes are not the same width as host system bytes.
PTRMEM_CST
These nodes represent pointer-to-member constants. The PTRMEM_CST_CLASS is the class type (either a RECORD_TYPE
or UNION_TYPE) within which the pointer points, and the
PTRMEM_CST_MEMBER is the declaration for the pointed-to object.
Note that the DECL_CONTEXT for the PTRMEM_CST_MEMBER is in
general different from the PTRMEM_CST_CLASS. For example,
given:
struct B { int i; };
struct D : public B {};
int D::*dp = &D::i;
The PTRMEM_CST_CLASS for &D::i is D, even though
the DECL_CONTEXT for the PTRMEM_CST_MEMBER is B,
since B::i is a member of B, not D.
VAR_DECL
NEGATE_EXPR
The behavior of this operation on signed arithmetic overflow is
controlled by the flag_wrapv and flag_trapv variables.
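ABS_EXPR
These nodes represent the absolute value of the single operand. They are typically used to implement the abs, labs and llabs builtins for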
integer types, and the fabs, fabsf and fabsl
builtins for floating point types. The type of abs operation can
be determined by looking at the type of the expression.
This node is not used for complex types. To represent the modulus
or complex abs of a complex value, use the BUILT_IN_CABS,
BUILT_IN_CABSF or BUILT_IN_CABSL builtins, as used
to implement the C99 cabs, cabsf and cabsl
built-in functions.
BIT_NOT_EXPR
TRUTH_NOT_EXPR
For TRUTH_NOT_EXPR, the type of the operand and that of the result are always of BOOLEAN_TYPE
or INTEGER_TYPE.
PREDECREMENT_EXPR
PREINCREMENT_EXPR
POSTDECREMENT_EXPR
POSTINCREMENT_EXPR
These nodes represent increment and decrement expressions. In the case of PREDECREMENT_EXPR and
PREINCREMENT_EXPR, the value of the expression is the value
resulting after the increment or decrement; in the case of
POSTDECREMENT_EXPR and POSTINCREMENT_EXPR, it is the value
before the increment or decrement occurs. The type of the operand, like
that of the result, will be either integral, boolean, or floating-point.
ADDR_EXPR
As an extension, GCC allows users to take the address of a label. In
this case, the operand of the ADDR_EXPR will be a
LABEL_DECL. The type of such an expression is void*.
If the object addressed is not an lvalue, a temporary is created, and
the address of the temporary is used.
INDIRECT_REF
FIX_TRUNC_EXPR
FLOAT_EXPR
FIXME: How is the operand supposed to be rounded? Is this dependent on
-mieee?
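COMPLEX_EXPR
CONJ_EXPR
REALPART_EXPR
IMAGPART_EXPR
NON_LVALUE_EXPR
NOP_EXPR
These nodes represent conversions that do not require any code generation. For example, conversion of a char* to an
int* does not require any code to be generated; such a conversion is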
represented by a NOP_EXPR. The single operand is the expression
to be converted. The conversion from a pointer to a reference is also
represented with a NOP_EXPR.
CONVERT_EXPR
These nodes are similar to NOP_EXPRs, but are used in those
situations where code may need to be generated. For example, if an
int* is converted to an int, code may need to be generated
on some platforms. These nodes are never used for C++-specific
conversions, like conversions between pointers to different classes in
an inheritance hierarchy. Any adjustments that need to be made in such
cases are always indicated explicitly. Similarly, a user-defined
conversion is never represented by a CONVERT_EXPR; instead, the
function calls are made explicit.
THROW_EXPR
These nodes represent throw expressions. The single operand is
an expression for the code that should be executed to throw the
exception. However, there is one implicit action not represented in
that expression; namely the call to __throw. This function takes
no arguments. If setjmp/longjmp exceptions are used, the
function __sjthrow is called instead. The normal GCC back end
uses the function emit_throw to generate this code; you can
examine this function to see what needs to be done.
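LSHIFT_EXPR
RSHIFT_EXPR
BIT_IOR_EXPR
BIT_XOR_EXPR
BIT_AND_EXPR
TRUTH_ANDIF_EXPR
TRUTH_ORIF_EXPR
For TRUTH_ANDIF_EXPR and TRUTH_ORIF_EXPR, the type of the operands and that of the result are always of BOOLEAN_TYPE or INTEGER_TYPE.
TRUTH_AND_EXPR
TRUTH_OR_EXPR
TRUTH_XOR_EXPR
The type of the operands and that of the result are always of BOOLEAN_TYPE or INTEGER_TYPE.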
PLUS_EXPR
MINUS_EXPR
MULT_EXPR
The behavior of these operations on signed arithmetic overflow is
controlled by the flag_wrapv and flag_trapv variables.
RDIV_EXPR
TRUNC_DIV_EXPR
FLOOR_DIV_EXPR
CEIL_DIV_EXPR
ROUND_DIV_EXPR
TRUNC_DIV_EXPR rounds towards zero, FLOOR_DIV_EXPR
rounds towards negative infinity, CEIL_DIV_EXPR rounds towards
positive infinity and ROUND_DIV_EXPR rounds to the closest integer.
Integer division in C and C++ is truncating, i.e. TRUNC_DIV_EXPR.
The behavior of these operations on signed arithmetic overflow, when
dividing the minimum signed integer by minus one, is controlled by the
flag_wrapv and flag_trapv variables.
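The four integer rounding modes can be compared with a short stand-alone sketch in plain C. The helper names are hypothetical, and the tie-breaking rule in `round_div` (halves round away from zero) is an assumption, since the text only says “closest integer”. The overflow case of dividing the minimum signed integer by minus one is deliberately not handled.

```c
#include <assert.h>

/* TRUNC_DIV_EXPR: C's integer '/' already rounds toward zero.  */
static int trunc_div(int a, int b) {
  assert(b != 0);
  return a / b;
}

/* FLOOR_DIV_EXPR: round toward negative infinity.  */
static int floor_div(int a, int b) {
  assert(b != 0);
  int q = a / b;
  /* Inexact, negative quotient: truncation rounded up, so step down.  */
  if (a % b != 0 && (a < 0) != (b < 0))
    q -= 1;
  return q;
}

/* CEIL_DIV_EXPR: round toward positive infinity.  */
static int ceil_div(int a, int b) {
  assert(b != 0);
  int q = a / b;
  /* Inexact, positive quotient: truncation rounded down, so step up.  */
  if (a % b != 0 && (a < 0) == (b < 0))
    q += 1;
  return q;
}

/* ROUND_DIV_EXPR: round to the closest integer.
   Assumption: exact halves round away from zero.  */
static int round_div(int a, int b) {
  assert(b != 0);
  int q = a / b;
  long long r = a % b;
  long long d = b;
  if (r < 0) r = -r;
  if (d < 0) d = -d;
  if (2 * r >= d)
    q += ((a < 0) != (b < 0)) ? -1 : 1;
  return q;
}
```

For the operands -7 and 2, whose exact quotient is -3.5, `trunc_div` and `ceil_div` give -3 while `floor_div` gives -4.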
TRUNC_MOD_EXPR
FLOOR_MOD_EXPR
CEIL_MOD_EXPR
ROUND_MOD_EXPR
The integer modulus of two operands a and b is
defined as a - (a/b)*b, where the division is calculated using
the corresponding division operator. Hence for TRUNC_MOD_EXPR
this definition assumes division using truncation towards zero, i.e.
TRUNC_DIV_EXPR. Integer remainder in C and C++ uses truncating
division, i.e. TRUNC_MOD_EXPR.
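The definition a - (a/b)*b can be tried out directly. The following sketch uses hypothetical helper names and contrasts the truncating and flooring variants; the results differ only when the operands have opposite signs and the division is inexact.

```c
#include <assert.h>

/* Flooring division, as used by FLOOR_DIV_EXPR.  */
static int floor_div(int a, int b) {
  assert(b != 0);
  int q = a / b;
  if (a % b != 0 && (a < 0) != (b < 0))
    q -= 1;
  return q;
}

/* TRUNC_MOD_EXPR: a - (a/b)*b with truncating division.
   This matches C's own '%' operator.  */
static int trunc_mod(int a, int b) {
  assert(b != 0);
  return a - (a / b) * b;
}

/* FLOOR_MOD_EXPR: the same formula with flooring division.
   The result takes the sign of the divisor.  */
static int floor_mod(int a, int b) {
  return a - floor_div(a, b) * b;
}
```

For a = -7 and b = 2, `trunc_mod` gives -1 and `floor_mod` gives 1.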
EXACT_DIV_EXPR
The EXACT_DIV_EXPR code is used to represent integer divisions where
the numerator is known to be an exact multiple of the denominator. This
allows the backend to choose between the faster of TRUNC_DIV_EXPR,
CEIL_DIV_EXPR and FLOOR_DIV_EXPR for the current target.
ARRAY_REF
The lower bound and element size should not be read from the operands directly; call array_ref_low_bound and array_ref_element_size
instead.
ARRAY_RANGE_REF
The operands are the same as those of ARRAY_REF and have the same
meanings. The type of these expressions must be an array whose component
type is the same as that of the first operand. The range of that array
type determines the amount of data these expressions access.
TARGET_MEM_REF
The first argument is TMR_SYMBOL and must be a VAR_DECL of an object with
a fixed address. The second argument is TMR_BASE and the
third one is TMR_INDEX. The fourth argument is
TMR_STEP and must be an INTEGER_CST. The fifth
argument is TMR_OFFSET and must be an INTEGER_CST.
Any of the arguments may be NULL if the corresponding component
does not appear in the address. The address of the TARGET_MEM_REF
is determined in the following way:
&TMR_SYMBOL + TMR_BASE + TMR_INDEX * TMR_STEP + TMR_OFFSET
The sixth argument is the reference to the original memory access, which
is preserved for the purposes of the RTL alias analysis. The seventh
argument is a tag representing the results of tree level alias analysis.
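The address computation, including the handling of absent components, can be sketched as follows. `struct toy_tmr` and `toy_tmr_address` are hypothetical; a NULL pointer models a missing operand, and treating a missing step as 1 is an assumption made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the five address components.  A NULL member
   plays the role of a NULL operand in the TARGET_MEM_REF.  */
struct toy_tmr {
  const uintptr_t *symbol;  /* address of TMR_SYMBOL */
  const uintptr_t *base;    /* TMR_BASE */
  const uintptr_t *index;   /* TMR_INDEX */
  const uintptr_t *step;    /* TMR_STEP */
  const uintptr_t *offset;  /* TMR_OFFSET */
};

/* Computes &TMR_SYMBOL + TMR_BASE + TMR_INDEX * TMR_STEP + TMR_OFFSET,
   skipping components that are absent.  */
static uintptr_t toy_tmr_address(const struct toy_tmr *t) {
  assert(t != NULL);
  uintptr_t addr = 0;
  if (t->symbol)
    addr += *t->symbol;
  if (t->base)
    addr += *t->base;
  if (t->index)  /* assumption: a missing step counts as 1 */
    addr += *t->index * (t->step ? *t->step : 1);
  if (t->offset)
    addr += *t->offset;
  return addr;
}
```

A reference with symbol address 0x1000, index 3, step 8 and offset 4 (and no base) resolves to 0x1000 + 3 * 8 + 4.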
LT_EXPR
LE_EXPR
GT_EXPR
GE_EXPR
EQ_EXPR
NE_EXPR
For floating point comparisons, if we honor IEEE NaNs and either operand
is NaN, then NE_EXPR always returns true and the remaining operators
always return false. On some targets, comparisons against an IEEE NaN,
other than equality and inequality, may generate a floating point exception.
ORDERED_EXPR
UNORDERED_EXPR
UNLT_EXPR
UNLE_EXPR
UNGT_EXPR
UNGE_EXPR
UNEQ_EXPR
LTGT_EXPR
UNLT_EXPR returns true if either operand is an IEEE
NaN or the first operand is less than the second. With the possible
exception of LTGT_EXPR, all of these operations are guaranteed
not to generate a floating point exception. The result
type of these expressions will always be of integral or boolean type.
These operations return the result type's zero value for false,
and the result type's one value for true.
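The difference between the ordinary and the unordered comparisons can be observed from plain C using the C99 `<math.h>` comparison macros. This is a sketch of the semantics only; the wrapper names are hypothetical and nothing here is how GCC itself expands these codes.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Ordinary comparisons, in the manner of LT_EXPR and NE_EXPR.  */
static bool toy_lt(double a, double b) { return a < b; }
static bool toy_ne(double a, double b) { return a != b; }

/* UNORDERED_EXPR: true if either operand is a NaN.  */
static bool toy_unordered(double a, double b) {
  return isunordered(a, b);
}

/* UNLT_EXPR: true if either operand is a NaN or a is less than b.
   isless is a quiet comparison macro from C99.  */
static bool toy_unlt(double a, double b) {
  return isunordered(a, b) || isless(a, b);
}

/* UNEQ_EXPR: true if either operand is a NaN or the operands are equal.  */
static bool toy_uneq(double a, double b) {
  return isunordered(a, b) || a == b;
}

/* LTGT_EXPR: true if a is less than or greater than b.  */
static bool toy_ltgt(double a, double b) {
  return islessgreater(a, b);
}
```

With a NaN operand, `toy_ne`, `toy_unordered`, `toy_unlt` and `toy_uneq` are all true, while `toy_lt` and `toy_ltgt` are false.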
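MODIFY_EXPR
These nodes represent assignment. The left-hand side is the first operand; the right-hand side is the second operand. The left-hand side will be a VAR_DECL, INDIRECT_REF, COMPONENT_REF, or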
other lvalue.
These nodes are used to represent not only assignment with `=' but
also compound assignments (like `+='), by reduction to `='
assignment. In other words, the representation for `i += 3' looks
just like that for `i = i + 3'.
INIT_EXPR
These nodes are just like MODIFY_EXPR, but are used only when a
variable is initialized, rather than assigned to subsequently. This
means that we can assume that the target of the initialization is not
used in computing its own value; any reference to the lhs in computing
the rhs is undefined.
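COMPONENT_REF
These nodes represent non-static data member accesses. The first operand is the object; the second operand is the FIELD_DECL for the data member. The third operand represents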
the byte offset of the field, but should not be used directly; call
component_ref_field_offset instead.
COMPOUND_EXPR
COND_EXPR
These nodes represent ?: expressions. The first operand
is of boolean or integral type. If it evaluates to a nonzero value,
the second operand should be evaluated, and returned as the value of the
expression. Otherwise, the third operand is evaluated, and returned as
the value of the expression.
The second operand must have the same type as the entire expression,
unless it unconditionally throws an exception or calls a noreturn
function, in which case it should have void type. The same constraints
apply to the third operand. This allows array bounds checks to be
represented conveniently as (i >= 0 && i < 10) ? i : abort().
As a GNU extension, the C language front ends allow the second
operand of the ?: operator to be omitted in the source.
For example, x ? : 3 is equivalent to x ? x : 3,
assuming that x is an expression without side-effects.
In the tree representation, however, the second operand is always
present, possibly protected by SAVE_EXPR if the first
argument does cause side-effects.
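CALL_EXPR
These nodes represent calls to functions, including non-static member functions. The first operand is a pointer to the function to call; it is always an expression whose type is a POINTER_TYPE. The second argument is a TREE_LIST. The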
arguments to the call appear left-to-right in the list. The
TREE_VALUE of each list node contains the expression
corresponding to that argument. (The value of TREE_PURPOSE for
these nodes is unspecified, and should be ignored.) For non-static
member functions, there will be an operand corresponding to the
this pointer. There will always be expressions corresponding to
all of the arguments, even if the function is declared with default
arguments and some arguments are not explicitly provided at the call
sites.
STMT_EXPR
These nodes represent GCC's statement-expression extension, which allows code like this:
int f() { return ({ int j; j = 3; j + 7; }); }
In other words, a sequence of statements may occur where a single
expression would normally appear. The STMT_EXPR node represents
such an expression. The STMT_EXPR_STMT gives the statement
contained in the expression. The value of the expression is the value
of the last sub-statement in the body. More precisely, the value is the
value computed by the last statement nested inside BIND_EXPR,
TRY_FINALLY_EXPR, or TRY_CATCH_EXPR. For example, in:
({ 3; })
the value is 3 while in:
({ if (x) { 3; } })
there is no value. If the STMT_EXPR does not yield a value,
its type will be void.
BIND_EXPR
These nodes represent local blocks. The first operand is a list of variables, connected via their TREE_CHAIN field. These will
never require cleanups. The scope of these variables is just the body
of the BIND_EXPR. The body of the BIND_EXPR is the
second operand.
LOOP_EXPR
These nodes represent “infinite” loops. The LOOP_EXPR_BODY
represents the body of the loop. It should be executed forever, unless
an EXIT_EXPR is encountered.
EXIT_EXPR
These nodes represent conditional exits from the nearest enclosing LOOP_EXPR. The single operand is the condition; if it is
nonzero, then the loop should be exited. An EXIT_EXPR will only
appear within a LOOP_EXPR.
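CLEANUP_POINT_EXPR
CONSTRUCTOR
These nodes represent the brace-enclosed initializers for a structure or array. The initializers are held in a TREE_LIST. If the TREE_TYPE of the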
CONSTRUCTOR is a RECORD_TYPE or UNION_TYPE, then
the TREE_PURPOSE of each node in the TREE_LIST will be a
FIELD_DECL and the TREE_VALUE of each node will be the
expression used to initialize that field.
If the TREE_TYPE of the CONSTRUCTOR is an
ARRAY_TYPE, then the TREE_PURPOSE of each element in the
TREE_LIST will be an INTEGER_CST or a RANGE_EXPR of
two INTEGER_CSTs. A single INTEGER_CST indicates which
element of the array (indexed from zero) is being assigned to. A
RANGE_EXPR indicates an inclusive range of elements to
initialize. In both cases the TREE_VALUE is the corresponding
initializer. It is re-evaluated for each element of a
RANGE_EXPR. If the TREE_PURPOSE is NULL_TREE, then
the initializer is for the next available array element.
In the front end, you should not depend on the fields appearing in any
particular order. However, in the middle end, fields must appear in
declaration order. You should not assume that all fields will be
represented. Unrepresented fields will be set to zero.
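The array case can be made concrete with a toy expansion routine. Everything here is hypothetical: `struct toy_elt` models one list element, its `kind` field stands in for the three possible shapes of TREE_PURPOSE, and the “next available element” is assumed to be the one following the last element initialized. Because the values are plain integers, the re-evaluation of the initializer for each element of a range is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The three shapes a TREE_PURPOSE can take for an array CONSTRUCTOR:
   NULL_TREE, a single INTEGER_CST, or a RANGE_EXPR.  */
enum toy_purpose { TOY_NEXT, TOY_INDEX, TOY_RANGE };

struct toy_elt {
  enum toy_purpose kind;
  size_t lo, hi;  /* index (lo), or inclusive range bounds (lo..hi) */
  int value;      /* stands in for TREE_VALUE */
};

/* Expands the initializer list into out[0..len-1].  */
static void toy_expand(const struct toy_elt *elts, size_t n,
                       int *out, size_t len) {
  size_t next = 0;
  /* Unrepresented elements are set to zero.  */
  memset(out, 0, len * sizeof *out);
  for (size_t i = 0; i < n; i++) {
    size_t lo = next, hi = next;
    if (elts[i].kind == TOY_INDEX) {
      lo = hi = elts[i].lo;
    } else if (elts[i].kind == TOY_RANGE) {
      lo = elts[i].lo;
      hi = elts[i].hi;
    }
    assert(lo <= hi && hi < len);
    for (size_t k = lo; k <= hi; k++)
      out[k] = elts[i].value;
    next = hi + 1;  /* assumption: "next available" follows the last one */
  }
}
```

A list holding index 1 with value 5, a NULL purpose with value 6, and the range 4 to 6 with value 9 expands, for an eight-element array, to 0, 5, 6, 0, 9, 9, 9, 0.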
COMPOUND_LITERAL_EXPR
These nodes represent ISO C99 compound literals. The COMPOUND_LITERAL_EXPR_DECL_STMT is a DECL_STMT
containing an anonymous VAR_DECL for
the unnamed object represented by the compound literal; the
DECL_INITIAL of that VAR_DECL is a CONSTRUCTOR
representing the brace-enclosed list of initializers in the compound
literal. That anonymous VAR_DECL can also be accessed directly
by the COMPOUND_LITERAL_EXPR_DECL macro.
SAVE_EXPR
A SAVE_EXPR represents an expression (possibly involving
side-effects) that is used more than once. The side-effects should
occur only the first time the expression is evaluated. Subsequent uses
should just reuse the computed value. The first operand to the
SAVE_EXPR is the expression to evaluate. The side-effects should
be executed where the SAVE_EXPR is first encountered in a
depth-first preorder traversal of the expression tree.
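The evaluate-once behavior amounts to caching the value at the first use. A minimal sketch, with hypothetical names and a function pointer standing in for the wrapped expression:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a SAVE_EXPR: the wrapped expression plus a
   cache for its value.  */
struct toy_save_expr {
  int (*compute)(void);  /* the expression; may have side-effects */
  bool evaluated;
  int value;
};

/* Evaluates the expression on first use and reuses the result after.  */
static int toy_save_expr_value(struct toy_save_expr *s) {
  assert(s != NULL && s->compute != NULL);
  if (!s->evaluated) {
    s->value = s->compute();  /* side-effects happen here, once */
    s->evaluated = true;
  }
  return s->value;
}

/* An expression with a visible side-effect, for demonstration.  */
static int toy_counter = 0;
static int toy_next(void) { return ++toy_counter; }
```

Reading the value twice through `toy_save_expr_value` yields 1 both times and leaves `toy_counter` at 1, whereas calling `toy_next` twice directly would increment it twice.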
TARGET_EXPR
A TARGET_EXPR represents a temporary object. The first operand
is a VAR_DECL for the temporary variable. The second operand is
the initializer for the temporary. The initializer is evaluated and,
if non-void, copied (bitwise) into the temporary. If the initializer
is void, that means that it will perform the initialization itself.
Often, a TARGET_EXPR occurs on the right-hand side of an
assignment, or as the second operand to a comma-expression which is
itself the right-hand side of an assignment, etc. In this case, we say
that the TARGET_EXPR is “normal”; otherwise, we say it is
“orphaned”. For a normal TARGET_EXPR the temporary variable
should be treated as an alias for the left-hand side of the assignment,
rather than as a new temporary variable.
The third operand to the TARGET_EXPR, if present, is a
cleanup-expression (i.e., destructor call) for the temporary. If the
TARGET_EXPR is orphaned, then the cleanup must be executed when the
statement containing the TARGET_EXPR is complete. These cleanups must
always be executed in the order opposite to that in which they were
encountered. Note that if a temporary is created on one branch of a
conditional operator (i.e., in the second or third operand to a
COND_EXPR), the cleanup must be run only if that branch is
actually executed.
See STMT_IS_FULL_EXPR_P for more information about running these
cleanups.
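The ordering rules for cleanups behave like a stack: each temporary that is actually created pushes its cleanup, and the cleanups are popped at the end of the statement. The sketch below is a hypothetical model, not the mechanism GCC uses; the three `toy_destroy_*` functions merely record the order in which they run.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CLEANUPS 16

/* Hypothetical stack of pending cleanups for one full-expression.  */
struct toy_cleanups {
  void (*fn[TOY_MAX_CLEANUPS])(void);
  size_t count;
};

/* Registers a cleanup at the point where its temporary is created.  */
static void toy_push_cleanup(struct toy_cleanups *c, void (*fn)(void)) {
  assert(c->count < TOY_MAX_CLEANUPS);
  c->fn[c->count++] = fn;
}

/* Runs the cleanups in the reverse of the order of registration.  */
static void toy_run_cleanups(struct toy_cleanups *c) {
  while (c->count > 0)
    c->fn[--c->count]();
}

/* Destructor stand-ins that log their own execution.  */
static char toy_log[8];
static size_t toy_log_len;
static void toy_destroy_a(void) { toy_log[toy_log_len++] = 'a'; }
static void toy_destroy_b(void) { toy_log[toy_log_len++] = 'b'; }
static void toy_destroy_c(void) { toy_log[toy_log_len++] = 'c'; }
```

If temporary `a` is created first and then only the branch creating `c` of a conditional is taken, running the cleanups destroys `c`, then `a`; the cleanup for `b` never runs because it was never registered.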
AGGR_INIT_EXPR
An AGGR_INIT_EXPR represents the initialization as the return
value of a function call, or as the result of a constructor. An
AGGR_INIT_EXPR will only appear as a full-expression, or as the
second operand of a TARGET_EXPR. The first operand to the
AGGR_INIT_EXPR is the address of a function to call, just as in
a CALL_EXPR. The second operand holds the arguments to pass to that
function, as a TREE_LIST, again in a manner similar to that of
a CALL_EXPR.
If AGGR_INIT_VIA_CTOR_P holds of the AGGR_INIT_EXPR, then
the initialization is via a constructor call. The address of the third
operand of the AGGR_INIT_EXPR, which is always a VAR_DECL,
is taken, and this value replaces the first argument in the argument
list.
In either case, the expression is void.
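VA_ARG_EXPR
This node is used to implement support for the C/C++ variable argument-list mechanism. It represents expressions like va_arg (ap, type).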
Its TREE_TYPE yields the tree representation for type and
its sole argument yields the representation for ap.
OMP_PARALLEL
Represents #pragma omp parallel [clause1 ... clauseN]. It
has four operands:
Operand OMP_PARALLEL_BODY is valid while in GENERIC and
High GIMPLE forms. It contains the body of code to be executed
by all the threads. During GIMPLE lowering, this operand becomes
NULL and the body is emitted linearly after
OMP_PARALLEL.
Operand OMP_PARALLEL_CLAUSES is the list of clauses
associated with the directive.
Operand OMP_PARALLEL_FN is created by
pass_lower_omp; it contains the FUNCTION_DECL
for the function that will contain the body of the parallel
region.
Operand OMP_PARALLEL_DATA_ARG is also created by
pass_lower_omp. If there are shared variables to be
communicated to the children threads, this operand will contain
the VAR_DECL that contains all the shared values and
variables.
OMP_FOR
Represents #pragma omp for [clause1 ... clauseN]. It
has six operands:
Operand OMP_FOR_BODY contains the loop body.
Operand OMP_FOR_CLAUSES is the list of clauses
associated with the directive.
Operand OMP_FOR_INIT is the loop initialization code of
the form VAR = N1.
Operand OMP_FOR_COND is the loop conditional expression
of the form VAR {<,>,<=,>=} N2.
Operand OMP_FOR_INCR is the loop index increment of the
form VAR {+=,-=} INCR.
Operand OMP_FOR_PRE_BODY contains side-effect code from
operands OMP_FOR_INIT, OMP_FOR_COND and
OMP_FOR_INCR. These side-effects are part of the
OMP_FOR block but must be evaluated before the start of
the loop body.
The loop index variable VAR must be a signed integer variable,
which is implicitly private to each thread. Bounds
N1 and N2 and the increment expression
INCR are required to be loop invariant integer
expressions that are evaluated without any synchronization. The
evaluation order, frequency of evaluation and side-effects are
unspecified by the standard.
OMP_SECTIONS
Represents #pragma omp sections [clause1 ... clauseN].
Operand OMP_SECTIONS_BODY contains the sections body,
which in turn contains a set of OMP_SECTION nodes for
each of the concurrent sections delimited by #pragma omp
section.
Operand OMP_SECTIONS_CLAUSES is the list of clauses
associated with the directive.
OMP_SECTION
Section delimiter for OMP_SECTIONS.
OMP_SINGLE
Represents #pragma omp single.
Operand OMP_SINGLE_BODY contains the body of code to be
executed by a single thread.
Operand OMP_SINGLE_CLAUSES is the list of clauses
associated with the directive.
OMP_MASTER
Represents #pragma omp master.
Operand OMP_MASTER_BODY contains the body of code to be
executed by the master thread.
OMP_ORDERED
Represents #pragma omp ordered.
Operand OMP_ORDERED_BODY contains the body of code to be
executed in the sequential order dictated by the loop index
variable.
OMP_CRITICAL
Represents #pragma omp critical [name].
Operand OMP_CRITICAL_BODY is the critical section.
Operand OMP_CRITICAL_NAME is an optional identifier to
label the critical section.
OMP_RETURN
This node is used by the flow graph (tree-cfg.c) and OpenMP region
building code (omp-low.c).
OMP_CONTINUE
This node is used by OMP_FOR and
OMP_SECTIONS to mark the place where the code needs to
loop to the next iteration (in the case of OMP_FOR) or
the next section (in the case of OMP_SECTIONS).
In some cases, OMP_CONTINUE is placed right before
OMP_RETURN. But if there are cleanups that need to
occur right after the looping body, they will be emitted between
OMP_CONTINUE and OMP_RETURN.
OMP_ATOMIC
Represents #pragma omp atomic.
Operand 0 is the address at which the atomic operation is to be performed.
Operand 1 is the expression to evaluate. The gimplifier tries
three alternative code generation strategies. Whenever possible,
an atomic update built-in is used. If that fails, a
compare-and-swap loop is attempted. If that also fails, a
regular critical section around the expression is used.
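The second strategy, a compare-and-swap loop, has the following general shape when written by hand with C11 atomics. This sketch only illustrates the technique; it is not the code the gimplifier emits, and `toy_atomic_mul` is a hypothetical name. Multiplication is used as the update because it is a natural example of an operation without a dedicated fetch-and-op function in `<stdatomic.h>`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Atomically performs *addr = *addr * factor using a
   compare-and-swap loop.  */
static void toy_atomic_mul(_Atomic int *addr, int factor) {
  assert(addr != NULL);
  int old = atomic_load(addr);
  /* If another thread changed *addr in the meantime, the exchange
     fails, stores the current value into old, and the loop retries
     with a freshly computed product.  */
  while (!atomic_compare_exchange_weak(addr, &old, old * factor))
    ;
}
```

The new value is recomputed from `old` on every iteration, so the expression is always evaluated against the value that is actually replaced.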
OMP_CLAUSE
Represents clauses associated with one of the OMP_ directives.
Clauses are represented by separate sub-codes defined in
tree.h. Clause codes can be one of:
OMP_CLAUSE_PRIVATE, OMP_CLAUSE_SHARED,
OMP_CLAUSE_FIRSTPRIVATE,
OMP_CLAUSE_LASTPRIVATE, OMP_CLAUSE_COPYIN,
OMP_CLAUSE_COPYPRIVATE, OMP_CLAUSE_IF,
OMP_CLAUSE_NUM_THREADS, OMP_CLAUSE_SCHEDULE,
OMP_CLAUSE_NOWAIT, OMP_CLAUSE_ORDERED,
OMP_CLAUSE_DEFAULT, and OMP_CLAUSE_REDUCTION. Each code
represents the corresponding OpenMP clause.
Clauses associated with the same directive are chained together
via OMP_CLAUSE_CHAIN. Those clauses that accept a list
of variables are restricted to exactly one, accessed with
OMP_CLAUSE_VAR. Therefore, multiple variables under the
same clause C need to be represented as multiple C clauses
chained together. This facilitates adding new clauses during
compilation.
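The one-variable-per-clause rule means that a source clause such as private(a, b, c) becomes three chained clauses. The following toy model shows the shape of such a chain; `struct toy_clause` and its helpers are hypothetical, with the `var` and `chain` members standing in for OMP_CLAUSE_VAR and OMP_CLAUSE_CHAIN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum toy_clause_code { TOY_CLAUSE_PRIVATE, TOY_CLAUSE_SHARED };

/* Hypothetical model of a clause node holding exactly one variable.  */
struct toy_clause {
  enum toy_clause_code code;
  const char *var;           /* stands in for OMP_CLAUSE_VAR */
  struct toy_clause *chain;  /* stands in for OMP_CLAUSE_CHAIN */
};

/* Prepends one clause per variable to the chain and returns the new
   head.  Adding a clause later is a single prepend.  */
static struct toy_clause *toy_add_clauses(struct toy_clause *list,
                                          enum toy_clause_code code,
                                          const char *const *vars,
                                          size_t n) {
  for (size_t i = 0; i < n; i++) {
    struct toy_clause *c = malloc(sizeof *c);
    assert(c != NULL);
    c->code = code;
    c->var = vars[i];
    c->chain = list;
    list = c;
  }
  return list;
}

/* Counts the clauses with the given code.  */
static size_t toy_count(const struct toy_clause *list,
                        enum toy_clause_code code) {
  size_t n = 0;
  for (; list != NULL; list = list->chain)
    if (list->code == code)
      n++;
  return n;
}

/* Releases the whole chain.  */
static void toy_free(struct toy_clause *list) {
  while (list != NULL) {
    struct toy_clause *next = list->chain;
    free(list);
    list = next;
  }
}
```

Adding three private variables and one shared variable produces a chain of four nodes, three of which carry the private code.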