MUZAK58

From SizeCoding
Revision as of 14:27, 13 February 2024 by WiRe (talk | contribs) (Music Sequence Table)

Jump to: navigation, search

MUZAK58 was created by wiRe/Napalm and is 58 bytes in size. It achieved 4th place at the Lovebyte 2024 demoparty and is a pure tech demo of a size-optimized bytebeat player for MSDOS and COVOX LPT-DAC, as also used in other sizecoding releases by wiRe. This page will describe how this player works and how it can be adopted for other releases. Feel free to reuse those ideas and techniques in your own productions, but please give a credit to wiRe then. Before you continue, make sure to read Bytebeat and Output#Producing_sound for the basics.

COVOX LPT-DAC

The COVOX LPT-DAC, also called Disney Sound, is an 8-bit digital-to-analog converter (DAC) connected to the 8 data-out-lines of a parallel printer port (LPT). Typically it was assembled using a simple R-2R resistor ladder to perform the conversion into an analog signal, why it was very cheap to build such hardware device on your own those days. Playing back an 8-bit sample, like it is the output of a bytebeat algorithm, through COVOX LPT DAC is a very easy task. Assuming the next sample value is inside register AL, then this is all you must do:

          mov     dx, 0378h                 ;BA7803     ;load LPT1 port address into DX
          out     dx, al                    ;EE         ;send 8-bit sample data to COVOX device

HINT: It is also possible to have 2 COVOX adapters, e.g. connected to LPT1 and LPT2, and send out two samples in parallel for stereo output, one sample for the left and one for the right channel.

But for a good audio playback quality, the time between two LPT1 writes should match the sampling rate pretty well. Also the bytebeat algorithm will require a time counter as input, which reflects the current sample number. This is why we need a good time source.

Time Source

For playing data through the COVOX LPTDAC, we need a pretty accurate timer. Typical sampling rate would be 8 kHz, but also higher values could be used. Lower values may also work in some special cases. There are multiple possible options to get such a time counter:

  • Timer Interrupt
  • Poll BIOS Counter
  • Poll PIT Counter
  • Alternative Options

Timer Interrupt

As described here: Output#Advanced_PC_Speaker_and_COVOX_sound_via_interrupt. While this is the most accurate way to drive the COVOX, it is very likely also the most expensive one. Setting up the new interrupt handler (let's even ignore the backup and restore of the old handler), the overhead of the handler itself and the problem of exchanging any counters between handler and firmware. All this will cost bytes. In most cases it will require less size if the timer is polled instead, like in all other variants described next. But it must be also clear that the polling approach will make it necessary to perform this task at a higher frequency than the actual samplingrate, i.e. 8kHz. This requires the polling to happen inside an inner loop, e.g. after each pixel update.

Poll BIOS Counter

The DWORD at [0:0x046C] holds the number of BIOS timer ticks. Typically INT8 will run at a frequency of 18.2 Hz. On each call the default interrupt handler will increase this value by 1. Reusing this default handler will prevent the costs for setting up an own handler just to implement the counter increment logic. So, a simple solution to get a timer counter is to reconfigure the PIT for an 8kHz rate. This will trigger the default INT8 handler 8000 times per second. Then this counter can be polled periodically inside the inner loop. Once its LSB changes, another sample must be generated, also using this counter value as sample counter, and sent to LPT1. This could look like this:

          mov     al, 149                   ;B095       ;program PIT #0 to ~8kHz (1.19318181818 MHz / 149 = 8007.93 Hz)
          out     40h, al                   ;E640       ;
          salc                              ;D6         ;  AL := 0 (if CF=0)
          out     40h, al                   ;E640       ;

          ; ...

suplp:    mov     al, [046Ch]               ;A0xxxx     ;load LSB from BIOS timer, assuming DS=0
_tcmp:    cmp     al, 0x55                  ;2C??       ;did timer value changed?
          jz      ntick                     ;74xx       ;  no -> skip audio
          mov     [_tcmp+1], al             ;A2xxxx     ;remember last BIOS timer value (selfmodifying code)

          inc     ebp                       ;6645       ;increment 32-bit timer counter

          ; ... set AL to next audio sample based on EBP ...

          mov     dx, 0378h                 ;BA7803     ;load LPT1 port address into DX
          out     dx, al                    ;EE         ;send 8-bit sample data to COVOX device

ntick:
          ; ...

          jmp     short suplp

HINT: Instead incrementing your own 16- or 32-bit timer counter (EBP inside the above example) someone could also use the BIOS timer counter itself, located at DWORD [0:0x046C]. Whatever fits better.

Poll PIT Counter

(...more soon...)

Alternative Options

Another option to get an accurate time is to read the processor's time-stamp counter using the RDTSC instruction.

The Bytebeat

Bytebeat is simply spoken an 8-bit uncompressed audiowave stream at any fixed sampling rate, that is expressed by a single, more or less complex, mathematical function f(t), where t is the number of the sample. It will start generation the first sample for t=0 and, in case of an 8kHz samplingrate, will reach the sample f(8000) after exactly 1 second.

In general, any bytebeat algorithm can be implemented now to present the next sample inside register AL, as it is required by the previous example code. But in respect to the size, a bytebeat algorithm is better suited if it's formula is as simple as possible, implementation-wise. The method used to achieve a size-optimized, but still flexible bytebeat formula is described next.

Music Sequence Table

MUZAK58 is to some degree a generic background music player. Of course it is also possible to modify the player itself to change the music style (more on that later), but the source of the played music comes from sequence tables stored in memory. Changing those words results in an entirely new music tune being played. Also spending more words for this table will vary the tune, that it will not repeat as fast.

The basic idea is to design the bytebeat algorithm as a loop that performs always the same math-operations to achieve the smallest possible size. Following this concept, we break down the bytebeat function f(t) into N smaller terms gN(t). Then function f(t) is comprised of N terms multiplied together like this: f(t) = Π(gN(t))/D = g0(t) * g1(t) * g2(t) * ... * gN-1(t) / D. On each loop iteration of the final bytebeat player, gN(t) is evaluated and it's result is multiplied to the total result of f(t).

The function g0 has a special meaning compared to the other terms, and will deliver the base waveform for the final music tune. To keep the code short, this function will deliver a saw-tooth or triangle waveform based on t. Both can be implemented with very few instructions. All other terms g1..gN will translate t into factors that will modify the frequency of this waveform. Those translations will be done through N different sequences, one sequence table per term.

HINT: Instead of only using one modulated waveform, also 2 or more can be used, like i.e.: f(t) = (Π(gN(t)) + Π(hN(t))) / D

In the reference implementation, each sequence table was chosen to store 4 sequence steps or "notes" (S=4). Which note to select is based on 2-bits of parameter t. Again considering the size, each sequence i=0..N-1 is using the lookup index (t>>(i+O))&3, where O as start offset can be chosen different for each music tune.

Putting this all together, we can now start composing one music tune this way:

  static constexpr unsigned O = 10;
  static constexpr unsigned N = 5;
  static constexpr unsigned S = 4;
  static constexpr unsigned D = 8;

  static constexpr uint8_t seqtbl[N][S] = { {3,1,4,1}, {6,6,12,6}, {2,4,2,2}, {5,9,4,6}, {4,8,4,4} };

  uint8_t get_sample( uint16_t t ) {
    for( unsigned i = 0; i < N; i++ ) t *= seqtbl[i][(t>>(i+O))&3];
    return static_cast<uint8_t>(t / D);
  }

The above code can also be executed here: bytebeat.demozoo.org

(...more soon...)