11. SMP Support
NOTE: At this early time we don't consider our SMP
runtime ready for prime time - it is disabled in the
current binaries. This will change.
11.1 Is SMP a good idea for your simulation?
vcomp was engineered from scratch to support
simulation on multiple-CPU systems, which can shorten
a simulation's runtime.
However, no matter how hard we try to keep the CPUs away
from each other, synchronization overhead always costs you
more than you expect. A good rule of thumb is that an SMP
program running on a dual processor gets between 1.5 and 1.7
times the performance of the same program running on a single
processor. So even though we're proud of our SMP implementation,
we recommend that, for the best bang for the buck, if you have
a relatively small simulation you run two copies at once rather
than one copy running SMP; you can ensure this by passing the
'-Do' runtime flag to the simulator.
If you have a self-checking testbed where the checking
part is a separate program (perhaps Perl or C), and the runtime
of the checking portion is comparable to that of the simulation,
you may be able to get a similar speedup by checking on the fly,
perhaps by piping the simulation's output to the checker.
11.2 Making your simulation suitable for SMP speedups
A thread is a flow of control - a bunch of things waiting to be
executed. Threads are mapped to physical processors when they are
executed; in an SMP system, more than one thread may be mapped to
different physical processors at the same time.
In our implementation there are a LOT of threads: for example,
every always statement in every instance of a module becomes a
thread, many wires have one or more threads embedded in them,
and so do assign statements. vcomp makes very light-weight
threads - light-weight even by traditional Unix standards.
vcomp will never map two threads within the same module instance
to different physical processors at the same time. This means that
if you have only one module in your simulation you will
never see SMP speedup.
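As a hypothetical sketch (the module and signal names below are our invention, not anything vcomp requires), work kept inside a single instance can never run on more than one processor, while the same work split across instances gives vcomp independently schedulable threads:

```verilog
// All the work in ONE instance: these two always blocks are
// threads in the same module instance, so vcomp will never run
// them on different processors at the same time.
module crunch_all (input clk, input [7:0] a, b, output reg [7:0] x, y);
  always @(posedge clk) x <= a + b;
  always @(posedge clk) y <= a - b;
endmodule

// The same work split across two instances: vcomp is now free to
// map each instance's thread to a different physical processor.
module adder (input clk, input [7:0] a, b, output reg [7:0] x);
  always @(posedge clk) x <= a + b;
endmodule

module subber (input clk, input [7:0] a, b, output reg [7:0] y);
  always @(posedge clk) y <= a - b;
endmodule

module crunch_split (input clk, input [7:0] a, b, output [7:0] x, y);
  adder  u_add (.clk(clk), .a(a), .b(b), .x(x));
  subber u_sub (.clk(clk), .a(a), .b(b), .y(y));
endmodule
```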
In order to minimize synchronization overhead, instead
of maintaining a single global event (or thread) queue, vcomp
maintains many event queues (worst case, one per
instanced module - but usually groups of modules
close to each other in the instance hierarchy
are collected into gangs that share an event queue).
This may mean that the ordering of events within a particular
time slice is different from what you have seen on
other simulators. Differences between simulators in the
ordering in which events are scheduled are normal - the Verilog®
standard is careful not to specify 0-time event ordering other
than around constructs like non-blocking assignment (<=),
0 delays (#0), and $strobe.
Although vcomp implements many parallel event queues, it is careful to
unify them into a global stratified queue as per the standard Verilog®
model, so that the above constructs - and all well-constructed Verilog®
programs - behave as the standard requires.
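For example, the stratified queue guarantees that non-blocking assignments read old values and that $strobe prints updated ones, no matter which event queue each always block happens to land on. This is standard Verilog® behavior, not anything vcomp-specific; the sketch below is ours:

```verilog
module swap_demo;
  reg a, b, clk;
  initial begin
    a = 0; b = 1; clk = 0;
    #5 clk = 1;              // a single rising edge at time 5
    #5 $finish;
  end
  // These two blocks may live on different event queues, but
  // non-blocking assignment guarantees both read the OLD values,
  // so a and b swap cleanly:
  always @(posedge clk) a <= b;
  always @(posedge clk) b <= a;
  // $strobe runs after the non-blocking updates, so it is
  // guaranteed to see the swapped values: a=1 b=0.
  always @(posedge clk) $strobe("a=%b b=%b", a, b);
endmodule
```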
If you do see problems moving from simulator to simulator, it often
indicates as-yet-undetected race conditions in your simulation that have
been masked by a particular simulator's event ordering. Sometimes these
problems indicate a real race condition that may also occur
in real logic - it's best to find and fix them rather than avoid them!
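A minimal example of such a race (our sketch, not from any particular design): two always blocks use blocking assignments to write and read the same reg in the same time step, so the result depends on which block the simulator happens to run first.

```verilog
module racy;
  reg clk = 0, d = 1, q = 0, q2 = 0;
  // Blocking assignments in parallel blocks: whether the second
  // block sees the old or the new value of q depends on scheduler
  // ordering - a genuine race, and one that can also bite in real
  // logic. Using non-blocking assignment (<=) removes the race.
  always @(posedge clk) q  = d;
  always @(posedge clk) q2 = q;   // old q or new q? simulator-dependent
  initial begin
    #5 clk = 1;
    #1 $display("q2=%b", q2);     // may print 0 or 1
  end
endmodule
```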
Our PLI subsystem is single-threaded - only one thread
is allowed into it at any one time. It was designed
this way because we didn't want to break existing PLI programs
or require the addition of synchronization primitives.
If your simulations spend a lot of time in PLI calls, the
overhead of waiting for other PLI calls to complete may be
so great that you see little or no SMP speedup.
When a thread stalls within a PLI call because the PLI routine
makes an IO call that waits, or does a sleep(), etc., the other
threads may continue until they also need to make a PLI call,
or there are no more events to be scheduled - simulation time
will not advance while a thread is stalled in a PLI call.
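To illustrate (the system task name $get_packet here is hypothetical, standing in for any user PLI routine that blocks on IO):

```verilog
module stim (output reg [7:0] data);
  initial forever begin
    // While one thread is stalled inside this (hypothetical)
    // blocking PLI call, other threads may keep running their
    // remaining 0-time events, but simulation time will not
    // advance until the call returns.
    $get_packet(data);
    #10;   // time only moves forward after the PLI call completes
  end
endmodule
```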
If your simulation stalls listening to sockets and you depend
on it becoming idle, you may wish to disable SMP simulation.