\section{Yosys by example -- Synthesis} \begin{frame} \sectionpage \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{Typical Phases of a Synthesis Flow} \begin{frame}{\subsecname} \begin{itemize} \item Reading and elaborating the design \item High-level synthesis and optimization \begin{itemize} \item Converting {\tt always}-blocks to logic and registers \item Perform coarse-grain optimizations (resource sharing, const folding, ...) \item Handling of memories and other coarse-grain blocks \item Extracting and optimizing finite state machines \end{itemize} \item Convert remaining logic to bit-level logic functions \item Perform optimizations on bit-level logic functions \item Map bit-level logic and register to gates from cell library \item Write results to output file \end{itemize} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{Reading the design} \begin{frame}[fragile]{\subsecname} \begin{lstlisting}[xleftmargin=0.5cm, basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont] read_verilog file1.v read_verilog -I include_dir -D enable_foo -D WIDTH=12 file2.v read_verilog -lib cell_library.v verilog_defaults -add -I include_dir read_verilog file3.v read_verilog file4.v verilog_defaults -clear verilog_defaults -push verilog_defaults -add -I include_dir read_verilog file5.v read_verilog file6.v verilog_defaults -pop \end{lstlisting} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{Design elaboration} \begin{frame}[fragile]{\subsecname} During design elaboration Yosys figures out how the modules are hierarchically connected. It also re-runs the AST parts of the Verilog frontend to create all needed variations of parametric modules. \bigskip \begin{lstlisting}[xleftmargin=0.5cm, basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont] # simplest form. at least this version should be used after reading all input files # hierarchy # recommended form. fail if parts of the design hierarchy are missing. remove # everything that is unreachable by the top module. mark the top module. # hierarchy -check -top top_module \end{lstlisting} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{The ``proc'' commands} \begin{frame}[fragile]{\subsecname} The Verilog frontend converts {\tt always}-blocks to RTL netlists for the expressions and ``processes'' for the control- and memory elements. \medskip The {\tt proc} command transforms this ``processes'' to netlists of RTL multiplexer and register cells. \medskip The {\tt proc} command is actually a macro-command that calls the following other commands: \begin{lstlisting}[xleftmargin=0.5cm, basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont] proc_clean # remove empty branches and processes proc_rmdead # remove unreachable branches proc_init # special handling of "initial" blocks proc_arst # identify modeling of async resets proc_mux # convert decision trees to multiplexer networks proc_dff # extract registers from processes proc_clean # if all went fine, this should remove all the processes \end{lstlisting} \medskip Many commands can not operate on modules with ``processes'' in them. Usually a call to {\tt proc} is the first command in the actual synthesis procedure after design elaboration. \end{frame} \begin{frame}[fragile]{\subsecname{} -- Example 1/3} \begin{columns} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, language=verilog]{PRESENTATION_ExSyn/proc_01.v} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, frame=single]{PRESENTATION_ExSyn/proc_01.ys} \end{columns} \hfil\includegraphics[width=8cm,trim=0 0cm 0 0cm]{PRESENTATION_ExSyn/proc_01.pdf} \end{frame} \begin{frame}[t, fragile]{\subsecname{} -- Example 2/3} \vbox to 0cm{\includegraphics[width=\linewidth,trim=0cm 0cm 0cm -2.5cm]{PRESENTATION_ExSyn/proc_02.pdf}\vss} \vskip-1cm \begin{columns} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, language=verilog]{PRESENTATION_ExSyn/proc_02.v} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, frame=single]{PRESENTATION_ExSyn/proc_02.ys} \end{columns} \end{frame} \begin{frame}[t, fragile]{\subsecname{} -- Example 3/3} \vbox to 0cm{\includegraphics[width=\linewidth,trim=0cm 0cm 0cm -1.5cm]{PRESENTATION_ExSyn/proc_03.pdf}\vss} \vskip-1cm \begin{columns} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, frame=single]{PRESENTATION_ExSyn/proc_03.ys} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, language=verilog]{PRESENTATION_ExSyn/proc_03.v} \end{columns} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{The ``opt'' commands} \begin{frame}[fragile]{\subsecname} The {\tt opt} command implements a series of simple optimizations. It also is a macro command that calls other commands: \begin{lstlisting}[xleftmargin=0.5cm, basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont] opt_const # const folding opt_share -nomux # merging identical cells do opt_muxtree # remove never-active branches from multiplexer tree opt_reduce # consolidate trees of boolean ops to reduce functions opt_share # merging identical cells opt_rmdff # remove/simplify registers with constant inputs opt_clean # remove unused objects (cells, wires) from design opt_const # const folding while [changed design] \end{lstlisting} The command {\tt clean} can be used as alias for {\tt opt\_clean}. And {\tt ;;} can be used as shortcut for {\tt clean}. For example: \begin{lstlisting}[xleftmargin=0.5cm, basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont] proc; opt; memory; opt_const;; fsm;; \end{lstlisting} \end{frame} \begin{frame}[t, fragile]{\subsecname{} -- Example 1/4} \vbox to 0cm{\includegraphics[width=\linewidth,trim=0cm 0cm 0cm -0.5cm]{PRESENTATION_ExSyn/opt_01.pdf}\vss} \vskip-1cm \begin{columns} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, frame=single]{PRESENTATION_ExSyn/opt_01.ys} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, language=verilog]{PRESENTATION_ExSyn/opt_01.v} \end{columns} \end{frame} \begin{frame}[t, fragile]{\subsecname{} -- Example 2/4} \vbox to 0cm{\includegraphics[width=\linewidth,trim=0cm 0cm 0cm 0cm]{PRESENTATION_ExSyn/opt_02.pdf}\vss} \vskip-1cm \begin{columns} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, frame=single]{PRESENTATION_ExSyn/opt_02.ys} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, language=verilog]{PRESENTATION_ExSyn/opt_02.v} \end{columns} \end{frame} \begin{frame}[t, fragile]{\subsecname{} -- Example 3/4} \vbox to 0cm{\includegraphics[width=\linewidth,trim=0cm 0cm 0cm -2cm]{PRESENTATION_ExSyn/opt_03.pdf}\vss} \vskip-1cm \begin{columns} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, frame=single]{PRESENTATION_ExSyn/opt_03.ys} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, language=verilog]{PRESENTATION_ExSyn/opt_03.v} \end{columns} \end{frame} \begin{frame}[t, fragile]{\subsecname{} -- Example 4/4} \vbox to 0cm{\hskip6cm\includegraphics[width=6cm,trim=0cm 0cm 0cm -3cm]{PRESENTATION_ExSyn/opt_04.pdf}\vss} \vskip-1cm \begin{columns} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, language=verilog]{PRESENTATION_ExSyn/opt_04.v} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, frame=single]{PRESENTATION_ExSyn/opt_04.ys} \end{columns} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{When to use ``opt'' or ``clean''} \begin{frame}{\subsecname} Usually it does not hurt to call {\tt opt} after each regular command in the synthesis script. But it increases the synthesis time, so it is favourable to only call {\tt opt} when an improvement can be archieved. \bigskip The designs in {\tt yosys-bigsim} are a good playground for experimenting with the effects of calling {\tt opt} in various places of the flow. \bigskip It generally is a good idea us call {\tt opt} before inherently expensive commands such as {\tt sat} or {\tt freduce}, as the possible gain is much higher in this cases as the possible loss. \bigskip The {\tt clean} command on the other hand is very fast and many commands leave a mess (dangling signal wires, etc). For example, most commands do not remove any wires or cells. They just change the connections and depend on a later call to clean to get rid of the now unused objects. So the occasional {\tt ;;} is a good idea in every synthesis script. \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{The ``memory'' commands} \begin{frame}[fragile]{\subsecname} In the RTL netlist, memory reads and writes are individual cells. This makes consolidating the number of ports for a memory easier. The {\tt memory} transforms memories to an implementation. Per default that is logic for address decoders and registers. It also is a macro command that calls other commands: \begin{lstlisting}[xleftmargin=0.5cm, basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont] # this merges registers into the memory read- and write cells. memory_dff # this collects all read and write cells for a memory and transforms them # into one multi-port memory cell. memory_collect # this takes the multi-port memory cells and transforms it to address decoder # logic and registers. This step is skipped if "memory" is called with -nomap. memory_map \end{lstlisting} \bigskip Usually it is preferred to use architecture-specific RAM resources for memory. For example: \begin{lstlisting}[xleftmargin=0.5cm, basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont] memory -nomap; techmap -map my_memory_map.v; memory_map \end{lstlisting} \end{frame} \begin{frame}[t, fragile]{\subsecname{} -- Example 1/2} \vbox to 0cm{\includegraphics[width=\linewidth,trim=0cm 0cm 0cm -10cm]{PRESENTATION_ExSyn/memory_01.pdf}\vss} \vskip-1cm \begin{columns} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, frame=single]{PRESENTATION_ExSyn/memory_01.ys} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, language=verilog]{PRESENTATION_ExSyn/memory_01.v} \end{columns} \end{frame} \begin{frame}[t, fragile]{\subsecname{} -- Example 2/2} \vbox to 0cm{\hfill\includegraphics[width=7.5cm,trim=0cm 0cm 0cm -6cm]{PRESENTATION_ExSyn/memory_02.pdf}\vss} \vskip-1cm \begin{columns} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{6pt}{8pt}\selectfont, language=verilog]{PRESENTATION_ExSyn/memory_02.v} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, frame=single]{PRESENTATION_ExSyn/memory_02.ys} \end{columns} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{The ``fsm'' commands} \begin{frame}[fragile]{\subsecname{}} The {\tt fsm} command identifies, extracts, optimizes (re-encodes), and re-synthesizes finite state machines. It again is a macro that calls a series of other commands: \begin{lstlisting}[xleftmargin=0.5cm, basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont] fsm_detect # unless got option -nodetect fsm_extract fsm_opt opt_clean fsm_opt fsm_expand # if got option -expand opt_clean # if got option -expand fsm_opt # if got option -expand fsm_recode # unless got option -norecode fsm_info fsm_export # if got option -export fsm_map # unless got option -nomap \end{lstlisting} \end{frame} \begin{frame}{\subsecname{} -- details} Some details on the most importand commands from the {\tt fsm\_*} group: \bigskip The {\tt fsm\_detect} command identifies FSM state registers and marks them with the {\tt (* fsm\_encoding = "auto" *)} attribute, if they do not have the {\tt fsm\_encoding} set already. Mark registers with {\tt (* fsm\_encoding = "none" *)} to disable FSM optimization for a register. \bigskip The {\tt fsm\_extract} command replaces the entire FSM (logic and state registers) with a {\tt \$fsm} cell. \bigskip The commands {\tt fsm\_opt} and {\tt fsm\_recode} can be used to optimize the FSM. \bigskip Finally the {\tt fsm\_map} command can be used to convert the (optimized) {\tt \$fsm} cell back to logic and registers. \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{The ``techmap'' command} \begin{frame}[t]{\subsecname} \vbox to 0cm{\includegraphics[width=12cm,trim=-18cm 0cm 0cm -34cm]{PRESENTATION_ExSyn/techmap_01.pdf}\vss} \vskip-0.8cm The {\tt techmap} command replaces cells with an implementations given as verilog source. For example implementing a 32 bit adder using 16 bit adders: \vbox to 0cm{ \vskip-0.3cm \lstinputlisting[basicstyle=\ttfamily\fontsize{6pt}{7pt}\selectfont, language=verilog]{PRESENTATION_ExSyn/techmap_01_map.v} }\vbox to 0cm{ \vskip-0.5cm \lstinputlisting[xleftmargin=5cm, basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, frame=single, language=verilog]{PRESENTATION_ExSyn/techmap_01.v} \lstinputlisting[xleftmargin=5cm, basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, frame=single]{PRESENTATION_ExSyn/techmap_01.ys} } \end{frame} \begin{frame}[t]{\subsecname{} -- stdcell mapping} When {\tt techmap} is used without a map file, it uses a built-in map file to map all RTL cell types to a generic library of built-in logic gates and registers. \bigskip \begin{block}{The build-in logic gate types are:} {\tt \$\_INV\_ \$\_AND\_ \$\_OR\_ \$\_XOR\_ \$\_MUX\_} \end{block} \bigskip \begin{block}{The register types are:} {\tt \$\_SR\_NN\_ \$\_SR\_NP\_ \$\_SR\_PN\_ \$\_SR\_PP\_ \\ \$\_DFF\_N\_ \$\_DFF\_P\_ \\ \$\_DFF\_NN0\_ \$\_DFF\_NN1\_ \$\_DFF\_NP0\_ \$\_DFF\_NP1\_ \\ \$\_DFF\_PN0\_ \$\_DFF\_PN1\_ \$\_DFF\_PP0\_ \$\_DFF\_PP1\_ \\ \$\_DFFSR\_NNN\_ \$\_DFFSR\_NNP\_ \$\_DFFSR\_NPN\_ \$\_DFFSR\_NPP\_ \\ \$\_DFFSR\_PNN\_ \$\_DFFSR\_PNP\_ \$\_DFFSR\_PPN\_ \$\_DFFSR\_PPP\_ \\ \$\_DLATCH\_N\_ \$\_DLATCH\_P\_} \end{block} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{The ``abc'' command} \begin{frame}{\subsecname} The {\tt abc} command provides an interface to ABC\footnote[frame]{\url{http://www.eecs.berkeley.edu/~alanmi/abc/}}, an open source tool for low-level logic synthesis. \medskip The {\tt abc} command processes a netlist of internal gate types and can perform: \begin{itemize} \item logic minimization (optimization) \item mapping of logic to standard cell library (liberty format) \item mapping of logic to k-LUTs (for FPGA synthesis) \end{itemize} \medskip Optionally {\tt abc} can process registers from one clock domain and perform sequential optimization (such as register balancing). \medskip ABC is also controlled using scripts. An ABC script can be specified to use more advanced ABC features. It is also possible to write the design with {\tt write\_blif} and load the output file into ABC outside of Yosys. \end{frame} \begin{frame}[fragile]{\subsecname{} -- Example} \begin{columns} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, language=verilog]{PRESENTATION_ExSyn/abc_01.v} \column[t]{5cm} \lstinputlisting[basicstyle=\ttfamily\fontsize{8pt}{10pt}\selectfont, frame=single]{PRESENTATION_ExSyn/abc_01.ys} \end{columns} \includegraphics[width=\linewidth,trim=0 0cm 0 0cm]{PRESENTATION_ExSyn/abc_01.pdf} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{Other special-purpose mapping commands} \begin{frame}{\subsecname} \begin{block}{\tt dfflibmap} This command maps the internal register cell types to the register types described in a liberty file. \end{block} \bigskip \begin{block}{\tt hilomap} Some architectures require special driver cells for driving a constant hi or lo value. This command replaces simple constants with instances of such driver cells. \end{block} \bigskip \begin{block}{\tt iopadmap} Top-level input/outputs must usually be implemented using special I/O-pad cells. This command inserts this cells to the design. \end{block} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{Example Synthesis Script} \begin{frame}[fragile]{\subsecname} \begin{columns} \column[t]{4cm} \begin{lstlisting}[basicstyle=\ttfamily\fontsize{6pt}{7pt}\selectfont] # read and elaborate design read_verilog cpu_top.v cpu_ctrl.v cpu_regs.v read_verilog -D WITH_MULT cpu_alu.v hierarchy -check -top cpu_top # high-level synthesis proc; opt; memory -nomap;; fsm; opt # substitute block rams techmap -map map_rams.v # map remaining memories memory_map # low-level synthesis techmap; opt; flatten;; abc -lut6 techmap -map map_xl_cells.v # add clock buffers select -set xl_clocks t:FDRE %x:+FDRE[C] t:FDRE %d iopadmap -inpad BUFGP O:I @xl_clocks # add io buffers select -set xl_nonclocks w:* t:BUFGP %x:+BUFGP[I] %d iopadmap -outpad OBUF I:O -inpad IBUF O:I @xl_nonclocks # write synthesis results write_edif synth.edif \end{lstlisting} \column[t]{6cm} \vskip1cm \begin{block}{Teaser / Outlook} \small\parbox{6cm}{ The weird {\tt select} expressions at the end of this script are discussed in the next part (Section 3, ``Advanced Synthesis'') of this presentation.} \end{block} \end{columns} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%