Commit message [Author, Age, Files, Lines]
...
| * | | Fix constants bound to redeclared function args [Zachary Snow, 2020-12-26, 2 files, -5/+26]
|/ / /
| | |     The changes in #2476 ensured that function inputs like `input x;`
| | |     retained their single-bit size when instantiated with a constant
| | |     argument and turned into a localparam. That change did not handle the
| | |     possibility for an input to be redeclared later on with an explicit
| | |     width, such as `integer x;`.
* | | Bump version [Yosys Bot, 2020-12-24, 1 file, -1/+1]
| | |
* | | Merge pull request #2502 from ldoolitt/master [whitequark, 2020-12-23, 1 file, -2/+2]
|\ \ \
| | | |     passes/pmgen/pmgen.py: trivial change to remove C++ compiler warnings
| * | | passes/pmgen/pmgen.py: trivial change to remove C++ compiler warnings [Larry Doolittle, 2020-12-23, 1 file, -2/+2]
| | | |     Verified that the result still builds and passes self-tests
* | | | Merge pull request #2501 from zachjs/genrtlil-tern-sign [whitequark, 2020-12-23, 2 files, -4/+10]
|\ \ \ \
| | | | |     genrtlil: fix mux2rtlil generated wire signedness
| * | | | genrtlil: fix mux2rtlil generated wire signedness [Zachary Snow, 2020-12-22, 2 files, -4/+10]
| | | | |
* | | | | Merge pull request #2476 from zachjs/const-arg-width [whitequark, 2020-12-23, 2 files, -0/+18]
|\ \ \ \ \
| |_|/ / /
|/| | | |     Fix constants bound to single bit arguments (fixes #2383)
| * | | | Fix constants bound to single bit arguments (fixes #2383) [Zachary Snow, 2020-12-22, 2 files, -0/+18]
| | | | |
* | | | | Bump version [Yosys Bot, 2020-12-23, 1 file, -1/+1]
| |/ / /
|/| | |
* | | | Merge pull request #2499 from whitequark/cxxrtl-fixes [whitequark, 2020-12-22, 1 file, -9/+10]
|\ \ \ \
| | | | |     cxxrtl: don't crash generating debug information for unused wires
| * | | | cxxrtl: don't crash generating debug information for unused wires. [whitequark, 2020-12-22, 1 file, -9/+10]
|/ / / /
* | | | Merge pull request #2498 from StefanBruens/Fix_opt_lut [whitequark, 2020-12-22, 1 file, -2/+4]
|\ \ \ \
| | | | |     Fix use-after-free in LUT opt pass
| * | | | Fix use-after-free in LUT opt pass [StefanBruens, 2020-12-22, 1 file, -2/+4]
| | | | |     RTLIL::Module::remove(Cell* cell) calls `delete cell`. Any subsequent
| | | | |     access of `cell` then causes undefined behavior.
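
The hazard described in the commit message above can be sketched in isolation. This is a minimal illustration, not the Yosys code: `Cell`, `Module`, and `remove_and_report` are hypothetical stand-ins, and the only property carried over from the message is that removing a cell from its module also deletes it.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the RTLIL classes.
struct Cell {
    std::string name;
};

struct Module {
    std::vector<Cell*> cells;

    Cell* add(const std::string& name) {
        cells.push_back(new Cell{name});
        return cells.back();
    }

    // Mirrors the behavior named in the commit message: removal deletes the cell.
    void remove(Cell* cell) {
        auto it = std::find(cells.begin(), cells.end(), cell);
        assert(it != cells.end());
        cells.erase(it);
        delete cell;
    }

    ~Module() {
        for (Cell* c : cells) delete c;
    }
};

// Copy whatever is still needed out of the cell *before* removing it,
// and never dereference the pointer afterwards.
std::string remove_and_report(Module& module, Cell* cell) {
    std::string name = cell->name;  // read while the cell is still alive
    module.remove(cell);            // `cell` is dangling from here on
    return "removed " + name;
}
```

Reading `cell->name` after the call to `remove()` would be the use-after-free; moving the read ahead of the removal is one common way to avoid it.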
* | | | | Merge pull request #2497 from whitequark/cxxrtl-reflow [whitequark, 2020-12-22, 2 files, -446/+608]
|\ \ \ \ \
| |/ / / /
|/| | | |     cxxrtl: completely rewrite netlist layout code
| * | | | cxxrtl: split processes into sync and case nodes. [whitequark, 2020-12-22, 1 file, -11/+26]
| | | | |     Similar to the treatment of black boxes, splitting processes into two
| | | | |     scheduling nodes adds sufficient freedom so that netlists with
| | | | |     well-behaved processes (e.g. those emitted by nMigen) can immediately
| | | | |     converge.
| | | | |
| | | | |     Because processes are not emitted into edge-triggered regions, this
| | | | |     approach has comparable performance to -O5 (without -noproc), which is
| | | | |     substantially slower than -O6.
| * | | | kernel: undef Tcl macros interfering with cxxrtl. [whitequark, 2020-12-22, 1 file, -0/+2]
| | | | |
| * | | | cxxrtl: completely rewrite netlist layout code. [whitequark, 2020-12-22, 1 file, -406/+569]
| | | | |     The exact shape of C++ code emitted by CXXRTL has a critical effect on
| | | | |     performance, both compile-time and runtime. CXXRTL's performance
| | | | |     greatly improved when it started localizing and inlining wires, not
| | | | |     only because this assists the optimizer and register allocator, but
| | | | |     also because inlining code into edge-triggered regions cuts the time
| | | | |     spent in eval() by at least a factor of two.
| | | | |
| | | | |     However, the logic of netlist layout has always been ad-hoc, fragile,
| | | | |     and very hard to understand and modify. After commit ece25a45, which
| | | | |     introduced outlining, the same logic started being applied to two
| | | | |     distinct netlists at once instead of one, which barely worked.
| | | | |
| | | | |     This commit does four major changes:
| | | | |       * There is now a single unambiguous source of truth (per subgraph)
| | | | |         for the layout of any emitted wire.
| | | | |       * Netlist layout is now done entirely during analysis using well
| | | | |         known graph algorithms; no graph operations happen when emitting.
| | | | |       * Netlist layout now happens completely separately for eval() and
| | | | |         debug_eval() subgraphs.
| | | | |       * Unreachable (within subgraph scope) netlist nodes are now neither
| | | | |         emitted nor considered for wire inlining decisions.
| | | | |
| | | | |     The netlist layout code should also now closely match the described
| | | | |     semantics.
| | | | |
| | | | |     As a part of this large cleanup, it includes many miscellaneous
| | | | |     improvements:
| | | | |       * The "bare minimum" debug level introduced in commit dd6a761d was
| | | | |         split into two levels; -g1 now emits debug information *only* for
| | | | |         inputs and state wires, and -g2 now emits debug information for
| | | | |         all public members. The old behavior matches -g2. This is done to
| | | | |         avoid bloat on low optimization levels.
| | | | |       * Debug aliases and inlined connections are now handled separately,
| | | | |         and complex RHS never interferes with inlined connections.
| | | | |       * Aliases to outlined wires now carry a pointer to the outline.
| | | | |       * Cell sync outputs can now be emitted in debug_eval().
| | | | |       * Black box debug information now includes comb/sync driver flags.
| | | | |       * The comment emitted for inlined cells is now accurate.
| | | | |       * Debug information statistics now has less noise.
| | | | |       * Netlist layout code is now much better documented.
| | | | |
| | | | |     Due to more precise inlining decisions, unmodified (i.e. with no Yosys
| | | | |     script being used) netlists now have much more logic inlined into
| | | | |     edge-triggered regions. On Minerva SoC SRAM, this improves runtime by
| | | | |     20-25% across compilers and optimization levels.
| | | | |
| | | | |     Due to more precise reachability analysis, much less C++ code is now
| | | | |     emitted, especially at the maximum debug level. On Minerva SoC SRAM,
| | | | |     this improves clang compile time by 30-50% depending on options. gcc
| | | | |     is not affected.
| * | | | cxxrtl: simplify logic choosing wire type. NFCI. [whitequark, 2020-12-21, 1 file, -19/+8]
| | | | |
| * | | | cxxrtl: clarify node use-def construction. NFCI. [whitequark, 2020-12-21, 1 file, -18/+11]
| | | | |
| * | | | cxxrtl: fix typo. [whitequark, 2020-12-21, 1 file, -2/+2]
| | | | |
* | | | | Merge pull request #2479 from zachjs/const-arg-hint [whitequark, 2020-12-22, 2 files, -0/+14]
|\ \ \ \ \
| | | | | |     Allow constant function calls in constant function arguments
| * | | | | Allow constant function calls in constant function arguments [Zachary Snow, 2020-12-07, 2 files, -0/+14]
| | |/ / /
| |/| | |
* | | | | Merge pull request #2491 from zachjs/port-bind-sign [whitequark, 2020-12-22, 5 files, -5/+132]
|\ \ \ \ \
| | | | | |     Sign extend port connections where necessary
| * | | | | Sign extend port connections where necessary [Zachary Snow, 2020-12-18, 5 files, -5/+132]
| | | | | |     - Signed cell outputs are sign extended when bound to larger wires
| | | | | |     - Signed connections are sign extended when bound to larger cell
| | | | | |       inputs
| | | | | |     - Sign extension is performed in hierarchy and flatten phases
| | | | | |     - genrtlil indirects signed constants through signed wires
| | | | | |     - Other phases producing RTLIL may need to be updated to preserve
| | | | | |       signedness information
| | | | | |     - Resolves #1418
| | | | | |     - Resolves #2265
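
As a reminder of what sign extension means for the port bindings listed above, here is a minimal sketch on a plain bit vector. The `extend` helper and its least-significant-bit-first layout are assumptions made for this example; it does not reproduce the hierarchy or flatten passes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not a Yosys function. Bits are stored
// least-significant first, so bits.back() is the sign bit.
std::vector<bool> extend(const std::vector<bool>& bits, std::size_t width, bool is_signed) {
    assert(!bits.empty() && width >= bits.size());
    // Signed values repeat their top bit; unsigned values are padded with zeros.
    bool fill = is_signed ? bits.back() : false;
    std::vector<bool> out = bits;
    out.resize(width, fill);
    return out;
}
```

Under these assumptions, the 3-bit pattern `101` widened to 5 bits becomes `11101` when treated as signed and `00101` when treated as unsigned, which is the difference a signed port connection has to preserve.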
* | | | | | Bump version [Yosys Bot, 2020-12-22, 1 file, -1/+1]
| | | | | |
* | | | | | xilinx: Add some missing blackbox cells. [Marcelina Kościelnicka, 2020-12-21, 3 files, -798/+6276]
| | | | | |
* | | | | | xilinx: Regenerate cells_xtra.v using Vivado 2020.2 [Marcelina Kościelnicka, 2020-12-21, 2 files, -42/+49]
| |_|/ / /
|/| | | |
* | | | | Merge pull request #2496 from whitequark/cxxrtl-fixes [whitequark, 2020-12-21, 3 files, -9/+32]
|\ \ \ \ \
| | | | | |     cxxrtl: various improvements
| * | | | | cxxrtl: speed up bit repeats (sign extends, etc). [whitequark, 2020-12-21, 2 files, -5/+28]
| | | | | |     On Minerva SoC SRAM, depending on the compiler, this change improves
| | | | | |     overall time by 4-7%.
| * | | | | cxxrtl: speed up commits on clang. [whitequark, 2020-12-21, 1 file, -3/+3]
| | | | | |     On Minerva SoC SRAM compiled with clang-11, this change cuts commit
| | | | | |     time in half (!) and overall time by 20%. When compiled with gcc-10,
| | | | | |     there is no difference.
| * | | | | cxxrtl: use `static inline` instead of `inline` in the C API. [whitequark, 2020-12-20, 1 file, -1/+1]
| | | | | |     In C, non-static inline functions require an implementation elsewhere
| | | | | |     (even though the body is right there in the header). It is basically
| | | | | |     never desirable to use those as opposed to static inline ones.
* | | | | | Bump version [Yosys Bot, 2020-12-20, 1 file, -1/+1]
|/ / / / /
* | | | | Merge pull request #2487 from whitequark/cxxrtl-outlining [whitequark, 2020-12-19, 6 files, -169/+415]
|\ \ \ \ \
| | | | | |     CXXRTL: implement zero-cost full coverage debug information through
| | | | | |     the magic✨ of outlining🪄🎀🧹
| * | | | | cxxrtl: print names of cells inlined in connections. [whitequark, 2020-12-15, 1 file, -1/+10]
| | | | | |
| * | | | | cxxrtl: disable optimization of debug_items(). [whitequark, 2020-12-15, 2 files, -3/+15]
| | | | | |     Implementing outlining has greatly increased the amount of debug
| | | | | |     information in a typical build, and consequently exposed performance
| | | | | |     issues in C++ compilers, which are similar for both GCC and Clang;
| | | | | |     the compile time of Minerva SoC SRAM increased almost twofold.
| | | | | |
| | | | | |     Although one would expect the slowdown to be caused by the increased
| | | | | |     use of templates in `debug_eval()`, it is actually almost entirely
| | | | | |     attributable to optimizations and codegen for `debug_items()`.
| | | | | |
| | | | | |     Fortunately, it is neither possible nor desirable to optimize
| | | | | |     `debug_items()`: in most cases it is called exactly once, and its
| | | | | |     body is a linear sequence of calls with unique arguments.
| | | | | |
| | | | | |     This commit turns off optimizations for `debug_items()` on GCC and
| | | | | |     Clang, improving -Os compile time of Minerva SoC SRAM by ~40% (!)
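
One way to turn off optimization for a single function on GCC and Clang is a per-function attribute. The sketch below is an assumption about how that could look, not the change made in the commit above: the macro name `SKETCH_UNOPTIMIZED` and the function `debug_items_sketch` are invented for this example, while `optnone` (Clang) and `optimize("O0")` (GCC) are existing compiler attributes.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical macro name. Clang also defines __GNUC__, so it is checked first.
#if defined(__clang__)
#define SKETCH_UNOPTIMIZED __attribute__((noinline, optnone))
#elif defined(__GNUC__)
#define SKETCH_UNOPTIMIZED __attribute__((noinline, optimize("O0")))
#else
#define SKETCH_UNOPTIMIZED  // unknown compiler: leave the function as-is
#endif

// Stand-in for a function that is called once and whose body is a long,
// linear sequence of registrations with unique arguments.
SKETCH_UNOPTIMIZED
void debug_items_sketch(std::map<std::string, int>& items) {
    assert(items.empty());  // expected to run exactly once per map
    items["top.clk"] = 1;
    items["top.rst"] = 0;
    items["top.counter"] = 42;
}
```

The attribute only affects how the compiler treats this one function; the rest of the translation unit is still built with the user-specified optimization level.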
| * | | | | cxxrtl: make alias analysis outlining-aware. [whitequark, 2020-12-15, 1 file, -38/+48]
| | | | | |     Before this commit, if a sequence of wires assigned in a chain would
| | | | | |     terminate on a cell, none of the wires would get marked as aliases,
| | | | | |     and typically all of the public wires would get outlined. The reason
| | | | | |     for this behavior is that alias analysis predates outlining and in
| | | | | |     fact runs before it.
| | | | | |
| | | | | |     After this commit, alias analysis runs after outlining and considers
| | | | | |     outlined wires valid aliasees. More importantly, if the chained wires
| | | | | |     contain any valid aliasees, then all of the wires are aliased to the
| | | | | |     one that is topologically deepest.
| | | | | |
| | | | | |     Aliased wires incur virtually no overhead for the VCD writer, unlike
| | | | | |     outlined wires that would otherwise take their place. On Minerva SoC
| | | | | |     SRAM, size of the full VCD dump is reduced by ~65%, and throughput is
| | | | | |     increased by ~55%.
| * | | | | cxxrtl: add a "bare minimum" debug information level. [whitequark, 2020-12-14, 1 file, -9/+17]
| | | | | |     Useful to reduce overhead when no debug capabilities are necessary
| | | | | |     except for access to design state.
| * | | | | cxxrtl: implement debug information outlining. [whitequark, 2020-12-14, 5 files, -71/+278]
| | | | | |     Aggressive wire localization and inlining is necessary for CXXRTL to
| | | | | |     achieve high performance. However, that comes with a cost: reduced
| | | | | |     debug information coverage. Previously, as a workaround, the `-Og`
| | | | | |     option could have been used to guarantee complete coverage, at a cost
| | | | | |     of a significant performance penalty.
| | | | | |
| | | | | |     This commit introduces debug information outlining. The main eval()
| | | | | |     function is compiled with the user-specified optimization settings.
| | | | | |     In tandem, an auxiliary debug_eval() function, compiled from the same
| | | | | |     netlist, can be used to reconstruct the values of localized/inlined
| | | | | |     signals on demand. To the extent that it is possible, debug_eval()
| | | | | |     reuses the results of computations performed by eval(), only filling
| | | | | |     in the missing values.
| | | | | |
| | | | | |     Benchmarking a representative design (Minerva SoC SRAM) shows that:
| | | | | |       * Switching from `-O4`/`-Og` to `-O6` reduces runtime by ~40%.
| | | | | |       * Switching from `-g1` to `-g2`, both used with `-O6`, increases
| | | | | |         compile time by ~25%.
| | | | | |       * Although `-g2` increases the resident size of generated modules,
| | | | | |         this has no effect on runtime.
| | | | | |
| | | | | |     Because the impact of `-g2` is minimal and the benefits of having
| | | | | |     unconditional 100% debug information coverage (and the performance
| | | | | |     improvement as well) are major, this commit removes `-Og` and changes
| | | | | |     the defaults to `-O6 -g2`.
| | | | | |
| | | | | |     We'll have our cake and eat it too!
| * | | | | cxxrtl: rename "elision" to "inlining". NFC. [whitequark, 2020-12-13, 1 file, -77/+77]
| | | | | |     "Elision" in this context is an unusual and not very descriptive term
| | | | | |     whereas "inlining" is common and straightforward. Also, introducing
| | | | | |     "inlining" makes it easier to introduce its dual under the obvious
| | | | | |     name "outlining".
| * | | | | cxxrtl: fix outdated comment. NFC. [whitequark, 2020-12-13, 1 file, -2/+2]
| | | | | |
| * | | | | cxxrtl: use IdString::isPublic(). NFC. [whitequark, 2020-12-13, 1 file, -4/+4]
| | | | | |
| * | | | | kernel: make IdString::isPublic() const. [whitequark, 2020-12-12, 1 file, -1/+1]
| | | | | |
* | | | | | Bump version [Yosys Bot, 2020-12-18, 1 file, -1/+1]
| | | | | |
* | | | | | xilinx: Add FDDRCPE and FDDRRSE blackbox cells. [Marcelina Kościelnicka, 2020-12-17, 2 files, -0/+33]
| |/ / / /
|/| | | |
| | | | |     These are necessary primitives for proper DDR support on Virtex 2 and
| | | | |     Spartan 3.
* | | | | Bump version [Yosys Bot, 2020-12-15, 1 file, -1/+1]
| | | | |
* | | | | timinginfo: Error instead of segfault on const signals. [Marcelina Kościelnicka, 2020-12-15, 1 file, -2/+2]
| | | | |     Reported by @Ravenslofty
* | | | | Bump version [Yosys Bot, 2020-12-13, 1 file, -1/+1]
|/ / / /
* | | | | Merge pull request #2485 from whitequark/cxxrtl-cell-input-buffering [whitequark, 2020-12-12, 2 files, -25/+33]
|\ \ \ \
| | | | |     cxxrtl: don't overwrite buffered inputs
| * | | | cxxrtl: don't overwrite buffered inputs. [whitequark, 2020-12-11, 2 files, -25/+33]
| | | | |     Before this commit, a cell's input was always assigned like:
| | | | |         p_cell.p_input = (value...);
| | | | |
| | | | |     If `p_input` is buffered (e.g. if the design is built at -O0), this
| | | | |     is not correct. (In practice, this breaks clocking.) Unfortunately,
| | | | |     the incorrect design was compiled without diagnostics because wire<>
| | | | |     was move-assignable and also implicitly constructible from value<>.
| | | | |
| | | | |     After this commit, cell inputs are no longer incorrectly assumed to
| | | | |     always be unbuffered, and wires are not assignable from values.
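
The class of mistake described above (a buffered wire silently accepting a plain value through assignment) can be ruled out at compile time. The sketch below uses heavily simplified, hypothetical `Value` and `Wire` structs rather than the real `value<>` and `wire<>` templates, and only illustrates the general technique: an `explicit` constructor plus deleted copy operations.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical, simplified stand-ins; not the CXXRTL templates.
struct Value {
    unsigned data;
};

struct Wire {
    Value curr;  // value visible to readers
    Value next;  // value that becomes visible after commit()

    Wire() : curr{0}, next{0} {}
    explicit Wire(const Value& init) : curr(init), next(init) {}  // no implicit conversion
    Wire(const Wire&) = delete;             // deleting the copy operations also
    Wire& operator=(const Wire&) = delete;  // suppresses the implicit move operations

    void set(const Value& v) { next = v; }  // buffered write
    bool commit() {
        bool changed = curr.data != next.data;
        curr = next;
        return changed;
    }
};

// `some_wire = some_value;` no longer compiles.
static_assert(!std::is_assignable<Wire&, Value>::value,
              "a Wire must not be assignable from a Value");

// Usage: a write becomes visible only after commit().
inline unsigned demo() {
    Wire w;
    w.set(Value{1});
    assert(w.curr.data == 0);  // still the old value: the write is buffered
    w.commit();
    return w.curr.data;
}
```

With the assignment path closed off, writes to a buffered wire have to go through an explicit call such as `set()`, so overwriting both halves of the buffer by accident becomes a compile error instead of a silent clocking bug.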
* | | | | Bump version [Yosys Bot, 2020-12-10, 1 file, -1/+1]
| | | | |