author Tristan Gingold <tgingold@free.fr> 2020-01-04 17:59:44 +0100
committer Tristan Gingold <tgingold@free.fr> 2020-01-06 18:20:28 +0100
commit 09af03505bbd72f676394415c16c14bea5154513
tree 8444da62bd105b00aa2321acc3785357c550b0cb /doc
parent 61c0e71793576646cb8374cd462bcda7cf6e410e
doc: add internals/ (WIP). Add a part for index.
Diffstat (limited to 'doc')
-rw-r--r-- doc/index.rst 23
-rw-r--r-- doc/internals/AST.rst 95
-rw-r--r-- doc/internals/Frontend.rst 24
-rw-r--r-- doc/internals/Overview.rst 18
4 files changed, 159 insertions, 1 deletions
diff --git a/doc/index.rst b/doc/index.rst
index 7357f11ba..4229cc4ef 100644
--- a/doc/index.rst
+++ b/doc/index.rst
@@ -102,8 +102,29 @@
:caption: Development
:hidden:
- genindex
development/Synthesis
development/Debugging
development/CodingStyle
development/Roadmap
+
+.. raw:: latex
+
+ \part{Internals}
+
+.. toctree::
+ :caption: Internals
+ :hidden:
+
+ internals/Overview
+ internals/Frontend
+ internals/AST
+
+.. raw:: latex
+
+ \part{Index}
+
+.. toctree::
+ :caption: Index
+ :hidden:
+
+ genindex
diff --git a/doc/internals/AST.rst b/doc/internals/AST.rst
new file mode 100644
index 000000000..4e77f8f3c
--- /dev/null
+++ b/doc/internals/AST.rst
@@ -0,0 +1,95 @@
+.. _INT:AST:
+
+AST
+###
+
+Introduction
+************
+
+The AST is the main data structure of the front-end and is created by the parser.
+
+AST stands for Abstract Syntax Tree.
+
+It is a tree because it is a graph with nodes and links between nodes, the graph
+is acyclic, and each node but the root has exactly one parent (the node that links to it).
+In the front-end there is only one root, which represents the set of libraries.
+
+The tree is a syntax tree because it follows the grammar of the VHDL language: there
+is for example a node per operation (like `or`, `and` or `+`), a node per declaration,
+a node per statement, a node per design unit (like entity or architecture). The front-end needs to represent the source file using the grammar because most of the
+VHDL rules are defined according to the grammar.
+
+Finally, the tree is abstract because it is an abstraction of the source file. Comments and layout aren't kept in the syntax tree. Furthermore, if you rename a
+declaration or change the value of a literal, the tree will have exactly the same
+shape.
+
+But we can also say that the tree is neither abstract, nor syntactic, nor a tree.
+
+It is not abstract because all the information from the source file
+(except comments) is available in the AST, including the location. So the source
+file can be reprinted (the term unparsed is also used) from the AST. If a mechanism
+is also added to deal with comments, the source file can even be pretty-printed from
+the AST.
+
+It is not purely syntactic because the semantic analysis pass decorates the tree
+with semantic information. For example the type of each expression and sub-expression
+is computed. This is necessary to detect some semantic errors like assigning an array
+to an integer.
+
+Finally, it is no longer a tree because new links are added during semantic
+analysis. Simple names are linked to their declarations.
+
+The AST in GHDL
+***************
+
+The GHDL AST is described in file :file:`vhdl-nodes.ads`.
+
+An interesting particularity about the AST is the presence of a
+meta-model. The meta-model is not formally described (so, there is no
+meta-meta-model), but it is very simple: there are 3 kinds of vertices:
+
+* Variable lists of nodes (`List`). These are like vectors as the
+ length can be changed.
+
+* Fixed lists of nodes (`Flist`). The length of a fixed list is defined at creation.
+
+* Nodes. A node has a kind (`Iir_Kind` which is also defined in the file), and fields.
+ The kind is set at creation and cannot be changed, while fields can be.
+
+The meta-model describes the type of the fields: most of them are
+either a node reference, a boolean flag or an enumerated type (like
+`Iir_Staticness`). But there are several kinds of node references. A node can either own
+another node, which means this is the main reference to the node; or a node can
+reference another node without owning it.
+
+Why a meta-model?
+******************
+
+Having a meta-model makes it possible to build algorithms that deal with any
+node. The dumper (in file :file:`vhdl-disp_tree.ad[sb]`) is used to
+dump a node and possibly its sub-nodes. This is very useful while
+debugging GHDL. It is written using the meta-model, so it knows how to display
+a boolean and the various other enumerated types, and how to display a list. To
+display a node, it just gets the kind of the node, prints the kind name and queries
+all the fields of the node. There is nothing particular to a specific kind, so you
+don't need to modify the dumper if you add a node.
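As a rough illustration of why a generic dumper needs no per-kind code, here is a minimal Python sketch. It is not the code of :file:`vhdl-disp_tree.ad[sb]`: the `Node` class, the `Staticness` enumeration, the field names and the kind names are all hypothetical stand-ins chosen for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Staticness(Enum):
    # Hypothetical stand-in for an enumerated field type.
    NONE = 0
    GLOBALLY = 1
    LOCALLY = 2


@dataclass
class Node:
    kind: str                                   # set at creation
    fields: dict = field(default_factory=dict)  # field name -> value


def dump(value, indent=0):
    """Return the lines describing a value, using only its generic shape."""
    pad = "  " * indent
    if isinstance(value, Node):
        lines = [f"{pad}{value.kind}"]
        for name, sub in value.fields.items():
            lines.append(f"{pad}  {name}:")
            lines.extend(dump(sub, indent + 2))
        return lines
    if isinstance(value, (list, tuple)):
        lines = []
        for item in value:
            lines.extend(dump(item, indent))
        return lines
    if isinstance(value, Enum):
        return [f"{pad}{value.name}"]
    return [f"{pad}{value!r}"]  # booleans and other scalars


# Hypothetical expression "a or b"; the kind names are illustrative only.
expr = Node("Or_Operator", {
    "left": Node("Simple_Name", {"identifier": "a"}),
    "right": Node("Simple_Name", {"identifier": "b"}),
    "staticness": Staticness.NONE,
})
dumped = dump(expr)
```

Adding a new kind of node to this sketch only means creating `Node` values with a new `kind` string; `dump` itself stays unchanged, which is the property described above.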
+
+The dumper would not be a strong enough reason by itself to have a meta-model. But
+the pass that creates instances is a good one. When a VHDL-2008 package is instantiated,
+at least the package declaration is created in the AST (this is needed because there
+are possibly new types). And creating an instance using the meta-model is much
+simpler (and much more generic) than creating the instance using the nodes directly.
+The code to create instances is in files :file:`vhdl-sem_inst.ad[sb]`.
+
+The meta-model also structures the tree. We know that each node is owned only by one node, and that each node is owned (except the top-level one). So it is possible to
+free a sub-tree. It is also possible to check that the tree is well-formed.
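The well-formedness rule can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Node` class and its `owned` and `refs` lists are hypothetical, and GHDL's real checker is not shown here. The sketch verifies that the root is not owned, that every other node has exactly one owner, and that non-owning references point at nodes inside the tree.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    kind: str
    owned: list = field(default_factory=list)  # sub-nodes this node owns
    refs: list = field(default_factory=list)   # non-owning references


def check_well_formed(root):
    """Return True if the ownership links from root form a tree and
    every non-owning reference targets a node of that tree."""
    owner_count = {}
    nodes = {}
    stack = [root]
    while stack:
        node = stack.pop()
        nodes[id(node)] = node
        for child in node.owned:
            owner_count[id(child)] = owner_count.get(id(child), 0) + 1
            if owner_count[id(child)] > 1:
                return False  # owned twice (this also catches cycles)
            stack.append(child)
    if id(root) in owner_count:
        return False  # the top-level node must not be owned
    return all(id(r) in nodes for n in nodes.values() for r in n.refs)


# Hypothetical nodes: a name that refers to a declaration without owning it.
decl = Node("Signal_Declaration")
name = Node("Simple_Name", refs=[decl])
arch = Node("Architecture_Body", owned=[decl, name])
ok_before = check_well_formed(arch)

# A second owner for the same declaration breaks the rule.
arch.owned.append(Node("Process_Statement", owned=[decl]))
ok_after = check_well_formed(arch)
```

Because each node has a single owner, freeing a sub-tree in such a model amounts to walking the `owned` links only and never following `refs`.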
+
+Dealing with ownership
+**********************
+
+TBC: two fields, Is_Ref, Second_XXX; Rust & Scripts.
+
+Node Type
+*********
+
+TBC: 32-bit, extensions.
diff --git a/doc/internals/Frontend.rst b/doc/internals/Frontend.rst
new file mode 100644
index 000000000..6d5e1da5c
--- /dev/null
+++ b/doc/internals/Frontend.rst
@@ -0,0 +1,24 @@
+.. _INT:Frontend:
+
+Front-end
+#########
+
+Input files (or source files) are read by `files_map.ad[sb]`. Only regular files can be read, because they are read entirely before being scanned. This simplifies the scanner, and it also makes it possible to give each character in any file a unique index. Therefore the source location is a simple 32-bit integer whose type is `Location_Type`. From the location, `files_map` can deduce the source file (type is `Source_File_Entry`) and then the offset in the source file. There is a line table for each source file in order to speed up the conversion from file offset to line number and column number.
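The location scheme can be sketched as follows. This is a Python illustration under assumptions, not the Ada code of `files_map`: the `FilesMap` class, its method names, and the choice to reserve location 0 are all hypothetical. Each file receives a contiguous range of locations, and a per-file table of line start offsets is searched by bisection.

```python
import bisect


class FilesMap:
    def __init__(self):
        self.names = []         # file name of each file entry
        self.starts = []        # first location of each file entry
        self.line_tables = []   # per file: offsets at which lines start
        self.next_location = 1  # assumption: 0 means "no location"

    def add_file(self, name, text):
        """Register a whole file and return its entry number."""
        entry = len(self.names)
        self.names.append(name)
        self.starts.append(self.next_location)
        line_starts = [0] + [i + 1 for i, ch in enumerate(text) if ch == "\n"]
        self.line_tables.append(line_starts)
        self.next_location += len(text) + 1  # +1 so end of file has a location
        return entry

    def location(self, entry, offset):
        return self.starts[entry] + offset

    def resolve(self, location):
        """Map a location to (file name, 1-based line, 1-based column)."""
        entry = bisect.bisect_right(self.starts, location) - 1
        offset = location - self.starts[entry]
        table = self.line_tables[entry]
        line = bisect.bisect_right(table, offset) - 1
        return self.names[entry], line + 1, offset - table[line] + 1


fm = FilesMap()
a = fm.add_file("a.vhdl", "entity e is\nend e;\n")
b = fm.add_file("b.vhdl", "architecture a of e is\nbegin\nend a;\n")
where = fm.resolve(fm.location(b, 23))  # offset 23 is the 'b' of "begin"
```

Since the whole file is known when it is registered, the line table can be built once, and both lookups are binary searches over sorted integers.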
+
+The scanner (file :file:`vhdl-scanner.ad[sb]`) reads the source files and creates tokens
+from them. The tokens are defined in file :file:`vhdl-tokens.ads`. Tokens are scanned
+one by one, so the scanner doesn't keep the previous token in memory. Integer or
+floating point numbers are special tokens because besides the token itself there is
+also a variable for the value of the number.
+
+For identifiers there is a table containing all identifiers. This is implemented by
+file :file:`name_table.ad[sb]`. Each identifier is associated with a 32-bit number
+(identifiers are interned). So the number is used to reference an identifier. About
+one thousand identifiers are predefined (by :file:`std_names.ad[sb]`). Most of
+them are reserved identifiers (or keywords). When the scanner finds an identifier, it
+checks if it is a keyword. In that case it changes the token to the keyword token.
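A minimal Python sketch of interning with predefined keywords follows. It is an illustration under assumptions, not the interface of :file:`name_table.ad[sb]` or :file:`std_names.ad[sb]`: the `NameTable` class, the `classify` helper, and the rule that keywords receive the lowest numbers are hypothetical.

```python
class NameTable:
    def __init__(self, keywords):
        self.ids = {}    # identifier text -> number
        self.names = []  # number -> identifier text
        for word in keywords:  # predefined first, so they get the lowest numbers
            self.get_identifier(word)
        self.last_keyword = len(self.names) - 1

    def get_identifier(self, text):
        """Return the number of an identifier, creating it on first use."""
        key = text.lower()  # VHDL basic identifiers are case-insensitive
        if key not in self.ids:
            self.ids[key] = len(self.names)
            self.names.append(key)
        return self.ids[key]

    def is_keyword(self, ident):
        return ident <= self.last_keyword


def classify(table, text):
    """Mimic the scanner step: intern the text, then check for a keyword."""
    ident = table.get_identifier(text)
    if table.is_keyword(ident):
        return ("keyword", table.names[ident])
    return ("identifier", ident)


table = NameTable(["entity", "is", "end"])
first = classify(table, "Entity")
second = classify(table, "counter")
third = classify(table, "COUNTER")
```

With this numbering the keyword test is a single integer comparison, and two occurrences of the same identifier compare equal by number without any string comparison.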
+
+The procedure `scan` is called to get the next token. The location of the token and
+the location after the token are available so that they can be stored in the parse tree.
+
+The main client of the scanner is the parser.
diff --git a/doc/internals/Overview.rst b/doc/internals/Overview.rst
new file mode 100644
index 000000000..3be8772b4
--- /dev/null
+++ b/doc/internals/Overview.rst
@@ -0,0 +1,18 @@
+.. _INT:Overview:
+
+Overview
+########
+
+`GHDL` is structured like a traditional compiler. It has:
+
+* a driver (sources in :file:`src/ghdldrv`) to call the programs (compiler, assembler, linker) if needed.
+
+* a library (sources in :file:`src/grt`) to help execution at run-time.
+
+* a front-end (sources in :file:`src/vhdl`) to parse and analyse VHDL.
+
+* a back-end (in fact many, sources are in :file:`src/ortho`) to generate code.
+
+The architecture is modular. For example, it is possible to use the front-end in the `libghdl` library for the language server or to do synthesis (sources in :file:`src/synth`) instead of code generation.
+
+The main work is performed by the front-end, which is documented in the next chapter.