Tree (data structure)

计算机可行中，使用非常广泛的一种数据结构。

wikipedia Tree (data structure)

In computer science, a tree is a widely used abstract data type (ADT)—or data structure implementing this ADT—that simulates a hierarchical tree structure, with a root value and subtrees of children with a parent node, represented as a set of linked nodes.

A tree data structure can be defined recursively as a collection of nodes (starting at a root node), where each node is a data structure consisting of a value, together with a list of references to nodes (the "children"), with the constraints that no reference is duplicated, and none points to the root.

NOTE: 上述定义方法采用的是Recursive definition。

Alternatively, a tree can be defined abstractly as a whole (globally) as an ordered tree, with a value assigned to each node. Both these perspectives are useful: while a tree can be analyzed mathematically as a whole, when actually represented as a data structure it is usually represented and worked with separately by node (rather than as a set of nodes and an adjacency list of edges between nodes, as one may represent a digraph, for instance). For example, looking at a tree as a whole, one can talk about "the parent node" of a given node, but in general as a data structure a given node only contains the list of its children, but does not contain a reference to its parent (if any).

Preliminary definition

A tree is a nonlinear data structure, compared to arrays, linked lists, stacks and queues which are linear data structures. A tree can be empty with no nodes or a tree is a structure consisting of one node called the root and zero or one or more subtrees.

NOTE: 上述定义方法采用的是Recursive definition。

Mathematical definition

NOTE: 这一节是原文中最最晦涩难懂的章节了，它需要set theory的知识作为基础。那这就

其实可以简单理解，使用tree来表示集合的包含关系，这就好比是括号了。

Unordered tree

Mathematically, an unordered tree[1] (or "algebraic tree"[2]) can be defined as an algebraic structure $ (X,parent) $ where X is the non-empty carrier set of nodes and parent is a function on X which assigns each node x its "parent" node, parent(x). The structure is subject to the condition that every non-empty subalgebra must have the same fixed point. That is, there must be a unique "root" node r, such that parent(r) = r and for every node x, some iterative application parent(parent(…parent(x)…)) equals r.

NOTE :数学的定义强调的是每个node需要有一个且只有一个parent；需要注意的是，这里是数学上的定义，计算机实现的时候，往往不是定义parent，而是反向定义children。

Ordered tree

The structures introduced in the previous subsection form just the core "hierarchical" part of tree data structures that appear in computing. In most cases, there is also an additional "horizontal" ordering between siblings. In search trees the order is commonly established by the "key" or value associated with each sibling, but in many trees that is not the case. For example, XML documents, lists within JSON files, and many other structures have order that does not depend on the values in the nodes, but is itself data — sorting the paragraphs of a novel alphabetically would lose information.

Terminology used in trees

NOTE: 原文本节描述tree中的各种术语。

Representations

There are many different ways to represent trees; common representations represent the nodes as dynamically allocated records with pointers to their children, their parents, or both, or as items in an array, with relationships between them determined by their positions in the array (e.g., binary heap).

Indeed, a binary tree can be implemented as a list of lists (a list where the values are lists): the head of a list (the value of the first term) is the left child (subtree), while the tail (the list of second and subsequent terms) is the right child (subtree). This can be modified to allow values as well, as in Lisp S-expressions, where the head (value of first term) is the value of the node, the head of the tail (value of second term) is the left child, and the tail of the tail (list of third and subsequent terms) is the right child.

NOTE:

In general a node in a tree will not have pointers to its parents, but this information can be included (expanding the data structure to also include a pointer to the parent) or stored separately. Alternatively, upward links can be included in the child node data, as in a threaded binary tree.

Common operations

Enumerating all the items
Enumerating a section of a tree
Searching for an item
Adding a new item at a certain position on the tree
Deleting an item
Pruning: Removing a whole section of a tree
Grafting: Adding a whole section to a tree
Finding the root for any node
Finding the lowest common ancestor of two nodes

Traversal and search methods

Main article: Tree traversal

Common uses

Representing hierarchical data such as syntax trees
Storing data in a way that makes it efficiently searchable (see binary search tree and tree traversal)
Representing sorted lists of data
As a workflow for compositing digital images for visual effects
Storing Barnes-Hut trees used to simulate galaxies.

Recursive definition of tree

本文中，关于树的Recursive definition都有标注出来了。

树的结构是具备递归性的：一个节点的左节点可能是一棵树，右节点也可能是一棵树，显然树的定义是由它自身定义的；所以对树的操作可以充分利用树结构的递归性而写出递归函数；