| Commit message | Author | Age | Files | Lines |
|
Aggressive wire localization and inlining are necessary for CXXRTL to
achieve high performance. However, that comes with a cost: reduced
debug information coverage. Previously, as a workaround, the `-Og`
option could be used to guarantee complete coverage, at the cost of
a significant performance penalty.

This commit introduces debug information outlining. The main eval()
function is compiled with the user-specified optimization settings.
In tandem, an auxiliary debug_eval() function, compiled from the same
netlist, can be used to reconstruct the values of localized/inlined
signals on demand. To the extent that it is possible, debug_eval()
reuses the results of computations performed by eval(), only filling
in the missing values.

Benchmarking a representative design (Minerva SoC SRAM) shows that:
  * Switching from `-O4`/`-Og` to `-O6` reduces runtime by ~40%.
  * Switching from `-g1` to `-g2`, both used with `-O6`, increases
    compile time by ~25%.
  * Although `-g2` increases the resident size of generated modules,
    this has no effect on runtime.

Because the impact of `-g2` is minimal and the benefits of having
unconditional 100% debug information coverage (and the performance
improvement as well) are major, this commit removes `-Og` and changes
the defaults to `-O6 -g2`.

We'll have our cake and eat it too!
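
The split between eval() and debug_eval() can be illustrated with a toy model. This is only a sketch of the idea, not the code CXXRTL generates: the `toy_adder` struct and all of its members are hypothetical, and only the names eval() and debug_eval() come from the message above.

```cpp
#include <cstdint>

// Hypothetical toy module: out = (a + b) ^ mask, where the sum is an
// internal signal that an optimizing backend would inline away.
struct toy_adder {
    uint8_t a = 0, b = 0, mask = 0; // inputs
    uint8_t out = 0;                // output, always stored
    uint8_t debug_sum = 0;          // storage written only by debug_eval()

    // Fast path: the sum exists only as a temporary inside the expression.
    void eval() {
        out = static_cast<uint8_t>((a + b) ^ mask);
    }

    // On-demand path: recompute only the signal that eval() does not
    // store. The inputs and `out` are reused as they are.
    void debug_eval() {
        debug_sum = static_cast<uint8_t>(a + b);
    }
};
```

A debugger would call debug_eval() only when it needs to inspect the inlined signal, so the cost is paid on inspection rather than on every simulation step.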
|
"Elision" in this context is an unusual and not very descriptive term
whereas "inlining" is common and straightforward. Also, introducing
"inlining" makes it easier to introduce its dual under the obvious
name "outlining".
|
Before this commit, a cell's input was always assigned like:
    p_cell.p_input = (value...);
If `p_input` is buffered (e.g. if the design is built at -O0), this
is not correct. (In practice, this breaks clocking.) Unfortunately,
the incorrect design was compiled without diagnostics because wire<>
was move-assignable and also implicitly constructible from value<>.

After this commit, cell inputs are no longer incorrectly assumed to
always be unbuffered, and wires are not assignable from values.
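
Why the old assignment compiled, and how forbidding it helps, can be shown with simplified stand-ins. `toy_value`, `toy_wire`, and `drive()` below are hypothetical and much smaller than CXXRTL's real value<> and wire<>; the sketch only assumes that a buffered wire keeps a current and a next value.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for value<>.
struct toy_value {
    uint32_t data;
};

// Hypothetical stand-in for a buffered wire<>.
struct toy_wire {
    toy_value curr{0}; // what readers observe
    toy_value next{0}; // what drivers write; visible after commit()

    toy_wire() = default;
    // No conversion from toy_value and no assignment from it, so
    // `w = toy_value{1};` is a compile error instead of silently
    // building a temporary wire and overwriting both halves.
    toy_wire(const toy_value &) = delete;
    toy_wire &operator=(const toy_value &) = delete;

    // Make the driven value visible; reports whether anything changed.
    bool commit() {
        bool changed = curr.data != next.data;
        curr = next;
        return changed;
    }
};

static_assert(!std::is_assignable<toy_wire &, toy_value>::value,
              "wires must not be assignable from values");

// Driving a buffered input means writing `next` and committing later.
inline void drive(toy_wire &w, toy_value v) { w.next = v; }
```

With this shape, a clock edge is only observable after commit(), which is the behaviour that direct assignment to the whole wire would bypass.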
|
cxxrtl: use CXXRTL_ASSERT for RTL contract violations instead of assert
|
RTL contract violations and C++ contract violations are different:
the former depend on the netlist and will never violate memory safety
whereas the latter may. When loading a CXXRTL simulation into another
process, RTL contract violations should generally not crash it, while
C++ contract violations should.
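
One way to keep the two kinds of violation apart is a separate, overridable assertion macro. The sketch below uses a hypothetical `TOY_RTL_ASSERT` macro and a hypothetical `toy_one_hot_mux()` function; it does not reproduce the actual definition of CXXRTL_ASSERT, and the choice to throw is just one possible host policy.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical macro for this sketch. A host process can define it before
// this point to choose its own reaction; the default here throws instead
// of calling abort(), so the host survives a misbehaving netlist.
#ifndef TOY_RTL_ASSERT
#define TOY_RTL_ASSERT(cond) \
    do { \
        if (!(cond)) \
            throw std::runtime_error("RTL contract violation: " #cond); \
    } while (0)
#endif

// Hypothetical RTL-level contract: exactly one select line is active.
// Breaking it depends on the netlist and cannot corrupt memory.
inline uint32_t toy_one_hot_mux(uint32_t sel, uint32_t a, uint32_t b) {
    TOY_RTL_ASSERT(sel == 1 || sel == 2);
    return sel == 1 ? a : b;
}
```

Checks that guard memory safety would still use plain assert(), so that a C++ contract violation stops the process.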
|
cxxrtl: fix crashes caused by a floating or constant clock input
|
E.g. in:
    module test;
      wire clk = 0;
      reg data;
      always @(posedge clk)
        data <= 0;
    endmodule
|
Although it is always possible to destroy and recreate the design to
simulate a power-on reset, this has two drawbacks:
  * Black boxes are also destroyed and recreated, which causes them
    to reacquire their resources, which might be costly and/or erase
    important state.
  * Pointers into the design are invalidated and have to be acquired
    again, which is costly and might be very inconvenient if they are
    captured elsewhere (especially through the C API).
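
The two drawbacks above suggest resetting state in place instead. The subject line of this commit is not shown here, so the following is only a sketch under that assumption; `toy_design` and its members are hypothetical.

```cpp
#include <cstdint>

// Hypothetical design with one register and one black-box resource.
struct toy_design {
    uint32_t counter = 0; // design state, cleared by reset()
    int blackbox_handle;  // stands in for a costly external resource

    explicit toy_design(int handle) : blackbox_handle(handle) {}

    void step() { counter++; }

    // Return design state to its power-on value in place. The object
    // keeps its address and the black-box resource is left untouched.
    void reset() { counter = 0; }
};
```

Because the object is never destroyed, a pointer to `counter` taken before the reset still refers to the same storage afterwards.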
|
In most cases, a CXXRTL simulation would use a top module, either
because this module serves as an entry point to the CXXRTL C API,
or because the outputs of a top module are unbuffered, improving
performance. Taking this into account, the CXXRTL backend now runs
`hierarchy -auto-top` if there is no top module. For the few cases
where this behavior is unwanted, it now accepts a `-nohierarchy`
option.
Fixes #2373.
|
Fixes #2129.
|
Fixes #2374.
|
This can be useful to determine whether the wire should be a part of
a design checkpoint, whether it can be used to override design state,
and whether driving it may cause a conflict.
|
Before this commit, the meaning of "sync def" included some flip-flop
cells but not others. There was no actual reason for this; it was
just poorly defined.

After this commit, a "sync def" means that a wire holds design state
because it is connected directly to a flip-flop output, and may never
be unbuffered. This is not affected by the presence of async inputs.
|
This can be useful to distinguish e.g. a combinatorially driven wire
with type `CXXRTL_VALUE` from a module input with the same type, as
well as general introspection.
|
Nodes driven by a constant value have type CXXRTL_VALUE and their
`next` pointer set to NULL. (This is already documented.)
|
This bug was hidden if a header was generated.
|
Use C++11 final/override/[[noreturn]]
|
For several reasons:
  * They're more convenient than accessing .data.
  * They accommodate variably-sized types like size_t transparently.
  * They statically ensure that no out of range conversions happen.

For now these are only provided for unsigned integers, but eventually
they should be provided for signed integers too. (Annoyingly this
affects conversions to/from `char` at the moment.)

Fixes #2127.
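
The subject line for this message is missing, but the list above reads like a description of typed accessors. Assuming that, the following sketch shows how such accessors can reject out-of-range conversions at compile time. `toy_bits` and its get()/set() members are hypothetical and are not CXXRTL's actual API.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical value type storing Bits bits in 32-bit chunks.
template <size_t Bits>
struct toy_bits {
    static constexpr size_t chunks = (Bits + 31) / 32;
    uint32_t data[chunks] = {};

    // Read the whole value; refuses to compile if IntT is too narrow.
    template <class IntT>
    IntT get() const {
        static_assert(std::is_unsigned<IntT>::value, "unsigned only");
        static_assert(std::numeric_limits<IntT>::digits >= Bits,
                      "get<T>() would truncate the value");
        uint64_t wide = 0; // at most two chunks pass the check above
        for (size_t n = 0; n < chunks; n++)
            wide |= static_cast<uint64_t>(data[n]) << (n * 32);
        return static_cast<IntT>(wide);
    }

    // Write the whole value; refuses to compile if IntT is too wide.
    template <class IntT>
    void set(IntT v) {
        static_assert(std::is_unsigned<IntT>::value, "unsigned only");
        static_assert(std::numeric_limits<IntT>::digits <= Bits,
                      "set<T>() would truncate the argument");
        uint64_t wide = v;
        for (size_t n = 0; n < chunks; n++)
            data[n] = n < 2 ? static_cast<uint32_t>(wide >> (n * 32)) : 0;
    }
};
```

Because the checks use `digits` rather than a fixed type name, a type such as size_t works on both 32-bit and 64-bit hosts as long as it is wide enough for the value.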
|
cxxrtl: don't compute vital values in log_assert()
|
This breaks NDEBUG builds.
Fixes #2166.
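
The failure mode is the usual one for assertion macros that discard their argument when NDEBUG is defined. In this sketch the standard assert() stands in for log_assert(), and `toy_take()` and `toy_take_checked()` are hypothetical helpers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper with a side effect the caller depends on.
inline bool toy_take(std::vector<int> &queue, int &out) {
    if (queue.empty())
        return false;
    out = queue.back();
    queue.pop_back();
    return true;
}

// Wrong: with NDEBUG defined the whole expression disappears, so the
// element is never taken and `out` is never written.
//     assert(toy_take(queue, out));

// Right: compute the vital value first, then check the result.
inline int toy_take_checked(std::vector<int> &queue) {
    int out = 0;
    bool ok = toy_take(queue, out);
    assert(ok);
    (void)ok; // avoids an unused-variable warning in NDEBUG builds
    return out;
}
```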
|
cxxrtl: restrict the debug info of a blackbox to its ports.
|
cxxrtl: avoid unused variable warning for transparent $memrd ports
|
cxxrtl: Implement chunk-wise multiplication
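
Only the subject line of this commit is available, so the following is a generic sketch of chunk-wise (schoolbook) multiplication over 32-bit chunks rather than the code the commit added. `toy_mul()` is a hypothetical name, and the result is truncated to the width of the operands.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Multiply two N-chunk little-endian numbers, keeping the low N chunks.
template <size_t N>
std::array<uint32_t, N> toy_mul(const std::array<uint32_t, N> &a,
                                const std::array<uint32_t, N> &b) {
    std::array<uint32_t, N> result{};
    for (size_t i = 0; i < N; i++) {
        uint64_t carry = 0;
        // Partial products landing above chunk N-1 are discarded.
        for (size_t j = 0; i + j < N; j++) {
            // 32x32 product plus two 32-bit addends always fits in 64 bits.
            uint64_t t = static_cast<uint64_t>(a[i]) * b[j]
                       + result[i + j] + carry;
            result[i + j] = static_cast<uint32_t>(t);
            carry = t >> 32;
        }
    }
    return result;
}
```

Working in chunks keeps every intermediate within a native 64-bit integer, which is what lets the same routine handle operands of any width.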
|
cxxrtl: fix sshr sign-extension.
|
cxxrtl: fix rzext()
|
This was a correctness issue, but one of the consequences is that it
resulted in jumps in generated machine code where there should have
been none. As a side effect of fixing the bug, Minerva SoC became 10%
faster.
|
cxxrtl: handle multipart signals
|
This avoids losing design visibility when using the `splitnets` pass.
|
This can result in a massive reduction in runtime, up to 50% depending
on workload. Currently people are using `-mllvm -inline-threshold=`
as a workaround (with clang++), but this solution is more portable.
|
On Minerva, this improves runtime by around 10%, mostly by ensuring
that the logic driving FFs is packed into edge conditionals.
|
Without unbuffering output wires of, at least, toplevel modules, it
is not possible to have most designs that rely on IO via toplevel
ports (as opposed to using exclusively blackboxes) converge within
one delta cycle. That seriously impairs the performance of CXXRTL.

This commit avoids unbuffering outputs of all modules solely so that
in the future, CXXRTL could gain fully separate compilation, and not
for any present technical reason.
|
This also fixes an edge case with (*keep*) input ports.
|
cxxrtl: various compiler compatibility fixes
|