Appendix: Performance, timing accuracy, and Max settings 
-------------------------------------------------------------------

*NOTE: I should probably move this to main docs. eh?*

The purpose of this appendix is to look in detail at timing, latency, performance,
and Max settings. First off, let me assure you that the sequencing code has been tested
exhaustively, and running live sequencers with rock-solid timing in Scheme for Max
absolutely works. That's the good news! If you're willing to pay the CPU cost,
you can even get sample-accurate timing.
However, there is overhead to running Scheme, so understanding your options
for balancing timing accuracy, latency, and performance is worthwhile.

Like other dynamic languages such as Python, Ruby, JavaScript, and Common Lisp, the s7 Scheme
interpreter runs a garbage collector (GC). The GC runs occasionally, sweeping through
allocated memory and freeing anything that is no longer referenced. This is what makes it
possible for us to program without manually allocating and managing memory the way we need to
in languages like C, C++, or Rust.
On my system, the GC typically takes between 0.5 and 1.5 milliseconds to run,
but depending on your code and CPU it could take longer.
If the GC can't finish in time, we get a missed deadline and
our timing will slip: recorded output will show events falling behind
the correct time. If you have the Max audio setting Scheduler in Audio Interrupt selected,
and Max is making sound, you'll probably hear an audio underrun too, as this setting
forces the scheduler thread (the one running s4m) and the audio DSP thread to share
timeslices. If you don't have Scheduler in Audio Interrupt selected, the timing will slip a bit,
but you won't hear an audio click.

This means that for super accurate timing, we need to
do two things: run the GC frequently so that it always does its job quickly, and run
Max with enough latency that the GC running makes no difference to the timing.
How much latency is sufficient depends on what else your machine is doing,
both inside and outside Max, and on some of your Max settings.
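
As a rough way to think about that budget, here is a minimal Python sketch. The model and the names (`gc_headroom_ms`, `audio_load`) are assumptions made up for illustration, not anything Max or s4m exposes: it treats the output latency as a time window, subtracts the share already spent rendering audio, and checks whether one GC pass still fits.

```python
def gc_headroom_ms(latency_ms: float, gc_ms: float, audio_load: float = 0.0) -> float:
    """Time left over (ms) after audio rendering and one GC pass.

    latency_ms : output latency window in milliseconds
    gc_ms      : worst-case GC pause you have measured yourself
    audio_load : fraction of the window already used by audio, 0.0 to 1.0
    A negative result means the GC pass would overrun the window.
    """
    if not 0.0 <= audio_load <= 1.0:
        raise ValueError("audio_load must be between 0.0 and 1.0")
    free_ms = latency_ms * (1.0 - audio_load)
    return free_ms - gc_ms


# A 1.5 ms GC pass in a 5.8 ms window, with half the window spent on audio.
print(gc_headroom_ms(5.8, 1.5, audio_load=0.5))
# The same pass in a 2.9 ms window with the same load overruns (negative).
print(gc_headroom_ms(2.9, 1.5, audio_load=0.5))
```

This is deliberately simplistic (real scheduling is not a single fixed window), but it shows why a heavier audio load pushes you toward more latency.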

The first thing I do is hook up a metronome at about 100 ms (experiment!) to a message
box with a **gc** message and send this to s4m inlet 0. This forces the GC to run
every 100 ms, which is frequent enough that it never has too much to do on any
one pass.

The Max **I/O Vector Size** is the most important setting. In order to get
bang-on accuracy, we need this big enough for the GC to finish running. This is
also the setting that determines the latency from Max to your sound output. A setting
of 512 translates to about 11.6 ms at a 44100 Hz sample rate, while 256 is 5.8 ms. This is
ample time, *if Max isn't eating up that time already on audio*. On my machine, I can
run with anywhere from 128 to 1024 on this setting, depending on how much I'm taxing
the CPU, and the recorded output stays accurate to within a millisecond. If you don't mind
more slop in the timing, you can lower the I/O Vector Size and increase the Max
**Scheduler Slop** setting, which lets Max run the scheduler a bit late and then catch
up afterwards. This trades short-term accuracy for lower CPU use while keeping
long-term accuracy. If Max is producing no other audio, you can likely get the
I/O Vector Size down to 128 and still get accurate timing.
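
The latency figures above are just the vector size divided by the sample rate. A small Python sketch of that arithmetic (the function name `vector_latency_ms` is made up for illustration):

```python
def vector_latency_ms(vector_size: int, sample_rate: float = 44100.0) -> float:
    """Duration of one vector of audio, in milliseconds."""
    return vector_size / sample_rate * 1000.0


# Print the latency for the range of I/O Vector Sizes discussed above.
for size in (128, 256, 512, 1024):
    print(f"{size:>5} samples -> {vector_latency_ms(size):.1f} ms")
```

At a different sample rate, pass it as the second argument, for example `vector_latency_ms(512, 48000.0)`.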

The Max **Signal Vector Size** setting (in Audio Status) is also important to understand.
This determines how many samples of audio are calculated per audio rendering pass.
If you have Scheduler in Audio Interrupt selected, it also determines how frequently the
scheduler can run. The timing of your scheduler-generated (i.e. s4m) events can only be
as accurate on a small scale as this setting allows. If you want actual *sample-accurate*
timing, this needs to be 1! If Max is also making audio, reducing this number increases
the CPU load of Max, and thus requires you to raise the I/O Vector Size.
So if you need exact attack times to line up with audio generated elsewhere, you should
experiment with lowering this number until you are satisfied.
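
To get a feel for the granularity involved, here is a minimal Python sketch. It assumes a simplified model in which the scheduler runs once per signal vector and an event fires on the first vector boundary at or after its requested time; the function names are hypothetical.

```python
import math


def scheduler_period_ms(signal_vector_size: int, sample_rate: float = 44100.0) -> float:
    """Time between scheduler passes, in ms, under the one-pass-per-vector model."""
    return signal_vector_size / sample_rate * 1000.0


def quantized_time_ms(event_ms: float, signal_vector_size: int,
                      sample_rate: float = 44100.0) -> float:
    """Time of the first vector boundary at or after event_ms."""
    event_samples = event_ms * sample_rate / 1000.0
    boundary = math.ceil(event_samples / signal_vector_size)
    return boundary * signal_vector_size / sample_rate * 1000.0


for svs in (1, 16, 64):
    late = quantized_time_ms(10.0, svs) - 10.0
    print(f"vector size {svs:>2}: period {scheduler_period_ms(svs):.3f} ms, "
          f"event at 10 ms fires {late:.3f} ms late")
```

Under this model the worst-case error is just under one period, which is why a Signal Vector Size of 1 is what gives sample-level resolution.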

I've noticed a few things that might be helpful.

* Max is not good at hosting VST instruments. If you want to use VSTs and run at
  low latency, you should probably pipe MIDI to a DAW such as Live. I get much better
  performance running the VSTs in Live and using a virtual MIDI driver.
* If you don't need sample-accurate timing, you might want to run with Scheduler in Audio
  Interrupt off and more Scheduler Slop. For live use where s4m is doing all the timing,
  and you absolutely can't chance an audio underrun, this might be appropriate.
* Reducing any visual updates from audio (such as VU meters in the live.gain~ object)
  dramatically improves performance, allowing lower latency.
* If you have lots of GUI elements doing things in the Max low-priority thread, you
  might want to reduce how often the low-priority thread is serviced and lower the
  refresh rate in your Max settings (Event Interval, Redraw Queue Throttle, and Refresh Rate).
* If you make a very large Scheme program, you might want to split it into a low-priority
  and a high-priority instance. For example, if you want to drive a large bank of GUI
  elements, that can all be done in the low-priority thread, and you can pass data between
  the two using messages between Scheme instances or an intermediate data store such as a buffer.

The best thing to do is to experiment with these settings, record the output, and
take a look at the result in your audio editor.
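
If you note down the onset times from the recording, a few lines of Python can summarize how far they stray from the ideal grid. This is only a sketch: the function name and the onset values below are made up for illustration, and the grid is assumed to start at the first recorded onset.

```python
def timing_errors_ms(onsets_ms, interval_ms):
    """Deviation of each recorded onset from an ideal grid, in ms.

    The grid starts at the first onset and advances by interval_ms.
    Positive values mean the event landed late.
    """
    if not onsets_ms:
        return []
    start = onsets_ms[0]
    return [t - (start + i * interval_ms) for i, t in enumerate(onsets_ms)]


# Hypothetical onsets for 16th notes at 120 bpm (125 ms apart).
onsets = [0.0, 125.2, 250.1, 376.0, 500.3]
errors = timing_errors_ms(onsets, 125.0)
worst = max(abs(e) for e in errors)
print(f"worst deviation: {worst:.2f} ms")
```

Errors that keep growing from one event to the next suggest the timing is slipping, while small errors that hover around zero are just short-term jitter.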