========================
SoundWire Error Handling
========================

The SoundWire PHY was designed with care, so errors on the bus should be very
unlikely, and any that do occur should be limited to single-bit errors.
Examples of this design can be found in the synchronization mechanism (sync
loss after two errors) and the short CRCs used for the Bulk Register Access.

The errors can be detected with multiple mechanisms:

1. Bus clash or parity errors: This mechanism relies on low-level detectors
   that are independent of the payload and usages, and they cover both control
   and audio data. The current implementation only logs such errors. A
   possible improvement would be to invalidate the entire programming
   sequence and restart from a known position. When such errors occur outside
   of a control/command sequence, the SoundWire protocol provides no
   concealment or recovery for audio data. The location of the error also
   affects its audibility (errors in the most-significant bits of PCM samples
   are more audible), and the bus might be reset after a number of such
   errors are detected. Note that bus clashes due to programming errors (two
   streams using the same bit slots) cannot be distinguished from those due
   to electrical issues during the transmit/receive transition, although a
   recurring bus clash when audio is enabled is an indication of a bus
   allocation issue. The interrupt mechanism can also help identify Slaves
   which detected a Bus Clash or a Parity Error, but they may not be
   responsible for the errors, so resetting them individually is not a viable
   recovery strategy.

2. Command status: Each command is associated with a status, which only
   covers transmission of the data between devices. The ACK status indicates
   that the command was received and will be executed by the end of the
   current frame. A NAK indicates that the command was in error and will not
   be applied. In case of bad programming (command sent to a non-existent
   Slave or to a non-implemented register) or an electrical issue, the
   absence of a response signals that the command was ignored. Some Master
   implementations allow for a command to be retransmitted several times. If
   the retransmission fails, backtracking and restarting the entire
   programming sequence might be a solution. Alternatively, some
   implementations might directly issue a bus reset and re-enumerate all
   devices.

3. Timeouts: In a number of cases such as ChannelPrepare or
   ClockStopPrepare, the bus driver is supposed to poll a register field until
   it transitions to a NotFinished value of zero. The MIPI SoundWire spec 1.1
   does not define timeouts but the MIPI SoundWire DisCo document adds
   recommendations on timeouts. If such configurations do not complete, the
   driver will return a -ETIMEDOUT. Such timeouts are symptoms of a faulty
   Slave device and are likely impossible to recover from.

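
As a rough illustration of the retransmission policy described for the
command status, the sketch below retries a command a bounded number of times
and reports the last failure to the caller, which can then restart the
programming sequence or reset the bus. All names (``cmd_status``,
``xfer_fn``, ``send_with_retry``, ``fake_bus``) and the choice of returned
error codes are hypothetical and are not taken from the SoundWire bus driver.

.. code-block:: c

    #include <errno.h>

    /* Hypothetical status codes mirroring the three outcomes above. */
    enum cmd_status {
        CMD_ACK,     /* received, will be executed by the end of the frame */
        CMD_NAK,     /* command was in error and will not be applied */
        CMD_IGNORED, /* no response at all */
    };

    /* Hypothetical transport hook: sends one command, returns its status. */
    typedef enum cmd_status (*xfer_fn)(void *ctx);

    /*
     * Send a command and retransmit it up to max_retries times.
     * Returns 0 on ACK. If every attempt fails, returns -EIO when the last
     * status was a NAK and -ENODATA when the command was ignored.
     */
    static int send_with_retry(xfer_fn xfer, void *ctx,
                               unsigned int max_retries)
    {
        enum cmd_status st;
        unsigned int attempt = 0;

        do {
            st = xfer(ctx);
            if (st == CMD_ACK)
                return 0;
        } while (attempt++ < max_retries);

        return st == CMD_NAK ? -EIO : -ENODATA;
    }

    /* Fake bus for demonstration: fails a fixed number of times, then ACKs. */
    struct fake_bus {
        unsigned int failures_left;
        enum cmd_status failure;
        unsigned int calls;
    };

    static enum cmd_status fake_xfer(void *ctx)
    {
        struct fake_bus *bus = ctx;

        bus->calls++;
        if (bus->failures_left > 0) {
            bus->failures_left--;
            return bus->failure;
        }
        return CMD_ACK;
    }

With ``max_retries`` set to 3, a command is sent at most four times. Keeping
the bound small matters because a Slave that keeps ignoring a command is more
likely absent or misprogrammed than temporarily busy.
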
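
The polling behaviour described for timeouts can be sketched as a bounded
loop that stops as soon as the NotFinished bits read back as zero. This is
only a minimal userspace sketch: ``read_reg_fn``, ``poll_not_finished``,
``fake_dev`` and the mask value are hypothetical, and a real driver would
sleep between reads and derive the bound from the timeouts recommended by the
DisCo document rather than from a fixed read count.

.. code-block:: c

    #include <errno.h>
    #include <stdint.h>

    /* Hypothetical register accessor: 0 on success, negative errno on error. */
    typedef int (*read_reg_fn)(void *ctx, uint32_t addr, uint8_t *val);

    /*
     * Poll a register until the bits in not_finished_mask read back as zero.
     * Gives up with -ETIMEDOUT after max_polls reads.
     */
    static int poll_not_finished(read_reg_fn read_reg, void *ctx,
                                 uint32_t addr, uint8_t not_finished_mask,
                                 unsigned int max_polls)
    {
        unsigned int i;

        for (i = 0; i < max_polls; i++) {
            uint8_t val;
            int ret = read_reg(ctx, addr, &val);

            if (ret < 0)
                return ret;  /* transport error, not a timeout */
            if (!(val & not_finished_mask))
                return 0;    /* the operation has finished */
        }

        return -ETIMEDOUT;
    }

    /* Fake device for demonstration: reports bit 0 set for busy_reads reads. */
    struct fake_dev {
        unsigned int busy_reads;
    };

    static int fake_read(void *ctx, uint32_t addr, uint8_t *val)
    {
        struct fake_dev *dev = ctx;

        (void)addr;  /* the fake device has a single register */
        if (dev->busy_reads > 0) {
            dev->busy_reads--;
            *val = 0x01;
        } else {
            *val = 0x00;
        }
        return 0;
    }

Transport errors are returned as-is so that the caller can tell a failed read
apart from a Slave that never completes.
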
Errors during global reconfiguration sequences are extremely difficult to
handle:

1. BankSwitch: An error during the last command issuing a BankSwitch is
   difficult to backtrack from. Retransmitting the BankSwitch command may be
   possible in a single segment setup, but this can lead to synchronization
   problems when enabling multiple bus segments (a command with side effects
   such as frame reconfiguration would be handled at different times). A global
   hard-reset might be the best solution.

Note that SoundWire does not provide a mechanism to detect illegal values
written in valid registers. In a number of cases the standard even mentions
that the Slave might behave in implementation-defined ways. The bus
implementation does not provide a recovery mechanism for such errors; Slave
or Master driver implementers are responsible for writing valid values in
valid registers and for implementing additional range checking if needed.
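
One possible shape for such range checking is a thin wrapper that rejects
out-of-range values before anything reaches the bus. The sketch below is an
assumption-laden illustration: ``field_range``, ``write_reg_fn``,
``write_checked`` and ``fake_regs`` are hypothetical names, and the legal
range of any given register has to come from the Slave's documentation.

.. code-block:: c

    #include <errno.h>
    #include <stdint.h>

    /* Hypothetical description of the legal values for one register. */
    struct field_range {
        uint32_t addr;
        uint8_t min;
        uint8_t max;
    };

    /* Hypothetical register writer: 0 on success, negative errno on error. */
    typedef int (*write_reg_fn)(void *ctx, uint32_t addr, uint8_t val);

    /*
     * Write val only if it lies within [range->min, range->max].
     * Out-of-range values are rejected with -EINVAL and never sent.
     */
    static int write_checked(write_reg_fn write_reg, void *ctx,
                             const struct field_range *range, uint8_t val)
    {
        if (val < range->min || val > range->max)
            return -EINVAL;

        return write_reg(ctx, range->addr, val);
    }

    /* Fake register file for demonstration: records the last write. */
    struct fake_regs {
        unsigned int writes;
        uint32_t last_addr;
        uint8_t last_val;
    };

    static int fake_write(void *ctx, uint32_t addr, uint8_t val)
    {
        struct fake_regs *regs = ctx;

        regs->writes++;
        regs->last_addr = addr;
        regs->last_val = val;
        return 0;
    }

Rejecting the value on the host side is the only protection available here,
since the Slave itself is not required to report an illegal value.
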