Tree (data structure)
计算机可行中,使用非常广泛的一种数据结构。
wikipedia Tree (data structure)
In computer science, a tree is a widely used abstract data type (ADT)—or data structure implementing this ADT—that simulates a hierarchical tree structure, with a root value and subtrees of children with a parent node, represented as a set of linked nodes.
A tree data structure can be defined recursively as a collection of nodes (starting at a root node), where each node is a data structure consisting of a value, together with a list of references to nodes (the "children"), with the constraints that no reference is duplicated, and none points to the root.
NOTE: 上述定义方法采用的是Recursive definition。
Alternatively, a tree can be defined abstractly as a whole (globally) as an ordered tree, with a value assigned to each node. Both these perspectives are useful: while a tree can be analyzed mathematically as a whole, when actually represented as a data structure it is usually represented and worked with separately by node (rather than as a set of nodes and an adjacency list of edges between nodes, as one may represent a digraph, for instance). For example, looking at a tree as a whole, one can talk about "the parent node" of a given node, but in general as a data structure a given node only contains the list of its children, but does not contain a reference to its parent (if any).
Preliminary definition
A tree is a nonlinear data structure, compared to arrays, linked lists, stacks and queues which are linear data structures. A tree can be empty with no nodes or a tree is a structure consisting of one node called the root and zero or one or more subtrees.
NOTE: 上述定义方法采用的是Recursive definition。
Mathematical definition
NOTE: 这一节是原文中最最晦涩难懂的章节了,它需要set theory的知识作为基础。那这就
其实可以简单理解,使用tree来表示集合的包含关系,这就好比是括号了。
Unordered tree
Mathematically, an unordered tree[1] (or "algebraic tree"[2]) can be defined as an algebraic structure $ (X,parent) $ where X is the non-empty carrier set of nodes and parent is a function on X which assigns each node x its "parent" node, parent(x). The structure is subject to the condition that every non-empty subalgebra must have the same fixed point. That is, there must be a unique "root" node r, such that parent(r) = r and for every node x, some iterative application parent(parent(…parent(x)…)) equals r.
NOTE :数学的定义强调的是每个node需要有一个且只有一个parent;需要注意的是,这里是数学上的定义,计算机实现的时候,往往不是定义parent,而是反向定义children。
Ordered tree
The structures introduced in the previous subsection form just the core "hierarchical" part of tree data structures that appear in computing. In most cases, there is also an additional "horizontal" ordering between siblings. In search trees the order is commonly established by the "key" or value associated with each sibling, but in many trees that is not the case. For example, XML documents, lists within JSON files, and many other structures have order that does not depend on the values in the nodes, but is itself data — sorting the paragraphs of a novel alphabetically would lose information.
Terminology used in trees
NOTE: 原文本节描述tree中的各种术语。
Representations
There are many different ways to represent trees; common representations represent the nodes as dynamically allocated records with pointers to their children, their parents, or both, or as items in an array, with relationships between them determined by their positions in the array (e.g., binary heap).
Indeed, a binary tree can be implemented as a list of lists (a list where the values are lists): the head of a list (the value of the first term) is the left child (subtree), while the tail (the list of second and subsequent terms) is the right child (subtree). This can be modified to allow values as well, as in Lisp S-expressions, where the head (value of first term) is the value of the node, the head of the tail (value of second term) is the left child, and the tail of the tail (list of third and subsequent terms) is the right child.
NOTE:
In general a node in a tree will not have pointers to its parents, but this information can be included (expanding the data structure to also include a pointer to the parent) or stored separately. Alternatively, upward links can be included in the child node data, as in a threaded binary tree.
Common operations
- Enumerating all the items
- Enumerating a section of a tree
- Searching for an item
- Adding a new item at a certain position on the tree
- Deleting an item
- Pruning: Removing a whole section of a tree
- Grafting: Adding a whole section to a tree
- Finding the root for any node
- Finding the lowest common ancestor of two nodes
Traversal and search methods
Main article: Tree traversal
Common uses
- Representing hierarchical data such as syntax trees
- Storing data in a way that makes it efficiently searchable (see binary search tree and tree traversal)
- Representing sorted lists of data
- As a workflow for compositing digital images for visual effects
- Storing Barnes-Hut trees used to simulate galaxies.
Recursive definition of tree
本文中,关于树的Recursive definition都有标注出来了。
- https://cgi.csc.liv.ac.uk/~michele/TEACHING/COMP102/2006/5.4.pdf
- http://www.montefiore.ulg.ac.be/~piater/Cours/INFO0902/notes/tree/foil04.xhtml
树的结构是具备递归性的:一个节点的左节点可能是一棵树,右节点也可能是一棵树,显然树的定义是由它自身定义的;所以对树的操作可以充分利用树结构的递归性而写出递归函数;