DMA Mapping Error Analysis
Revision as of 17:47, 5 September 2012 by Shuahkhan
I analyzed current calls to dma_map_single() and dma_map_page() in the kernel to see if dma mapping errors are checked after mapping routines return.
Reference tree: linux-next, August 6, 2012.
The goal of this analysis is to find drivers that don't currently check dma mapping errors and fix them. I did a grep for dma_map_single() and dma_map_page() and looked at the code that calls these routines. I classified the results of dma mapping error check status as follows:
- No error checks
- Partial checks - In that source file, not all calls are followed by checks.
- Checks dma mapping errors, but doesn't unmap already mapped pages when a mapping error occurs in the middle of a multiple mapping attempt.
The first two categories are classified as broken and need fixing. The third also needs fixing, since it leaves dangling mapped pages and holds on to them, which is equivalent to a memory leak. Some drivers release all mapped pages when the device closes, but others don't. Going by the comments I found in some source files, not doing the unmap might be harmless on some architectures.
- Checks dma mapping errors and unmaps already mapped pages when a mapping error occurs in the middle of a multiple mapping attempt.
- Checks dma mapping errors without unlikely()
- Checks dma mapping errors with unlikely()
I lumped the above three cases together as good cases. Using unlikely() is icing on the cake, and not something we need to be concerned about compared to the other problems in this area.
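The difference between the "doesn't unmap" and "good" cases can be sketched as follows. This is a userspace illustration, not driver code: the mock_* functions and map_all() are hypothetical stand-ins so the pattern can be compiled outside the kernel. In a real driver the corresponding calls are dma_map_single(), dma_mapping_error() and dma_unmap_single(), which also take a struct device pointer and (for map and unmap) a DMA direction.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace stand-ins for the kernel DMA API. All mock_* names are
 * hypothetical and exist only to make the pattern compilable here.
 */
typedef uint64_t dma_addr_t;
#define MOCK_DMA_ERROR ((dma_addr_t)~0ULL)

static int mock_fail_at = -1;	/* index of the map call that fails; -1 = never */
static int mock_map_calls;	/* map calls made so far */
static int mock_live_mappings;	/* mappings currently outstanding */

static dma_addr_t mock_dma_map_single(void *buf, size_t len)
{
	(void)len;
	if (mock_map_calls++ == mock_fail_at)
		return MOCK_DMA_ERROR;
	mock_live_mappings++;
	return (dma_addr_t)(uintptr_t)buf;
}

static int mock_dma_mapping_error(dma_addr_t addr)
{
	return addr == MOCK_DMA_ERROR;
}

static void mock_dma_unmap_single(dma_addr_t addr, size_t len)
{
	(void)addr;
	(void)len;
	mock_live_mappings--;
}

/*
 * Map n buffers. Every map call is followed by an error check, and on
 * failure the buffers mapped so far are unmapped before returning -1.
 */
static int map_all(void *bufs[], size_t len, dma_addr_t out[], int n)
{
	int i;

	for (i = 0; i < n; i++) {
		out[i] = mock_dma_map_single(bufs[i], len);
		if (mock_dma_mapping_error(out[i]))
			goto unwind;
	}
	return 0;

unwind:
	/* Entry i failed and was never mapped; release entries 0..i-1. */
	while (--i >= 0)
		mock_dma_unmap_single(out[i], len);
	return -1;
}
```

A driver in the "doesn't unmap" category would have the error check but return directly instead of running the unwind loop, leaving the earlier mappings outstanding.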
dma_map_single() - results
- No error checks: 195 (46%)
- Partial checks: 46 (11%)
- Doesn't unmap: 26 (6%)
- Good: 147 (35%)
dma_map_page() - results
- No error checks: 61 (59%)
- Partial checks: 7 (6.8%)
- Doesn't unmap: 15 (14.5%)
- Good: 20 (19%)
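As a quick check of the dma_map_page() figures: the four counts add up to 103, so the 7 partial-check cases work out to roughly 6.8% of the total. The small sketch below does that arithmetic; the array and function names are hypothetical and chosen only for this illustration.

```c
/* Counts listed above for dma_map_page(), in list order:
 * no error checks, partial checks, doesn't unmap, good. */
static const int page_counts[4] = { 61, 7, 15, 20 };

/* Sum the first n entries of counts. */
static int sum_counts(const int *counts, int n)
{
	int i, total = 0;

	for (i = 0; i < n; i++)
		total += counts[i];
	return total;
}

/* Share of count in total as a percentage; 0.0 if total is not positive. */
static double percent_of(int count, int total)
{
	return total > 0 ? 100.0 * count / total : 0.0;
}
```

For example, percent_of(page_counts[1], sum_counts(page_counts, 4)) gives the partial-checks share.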
In summary, a large percentage of the cases (> 50%) go unchecked. That raises the following questions:
- When do mapping errors get detected?
- How often do these errors occur?
- Why don't we see failures related to missing dma mapping error checks?
- Are they silent failures?
To gather more information, I propose the following:
- Enhance the swiotlb or dma-debug infrastructure to track how often the overflow buffer is triggered. I am working on this change.
- Enhance the dma-debug infrastructure to track dma map and unmap errors. I am working on this change.
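As a rough illustration of what tracking mapping error checks could look like, a tracker might record each mapping as "not checked" when it is created, flip it to "checked" when the driver tests the result, and count mappings that reach unmap without ever being checked. This is a userspace sketch under those assumptions; every name in it is hypothetical, and it is not the dma-debug code or the change described above.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;	/* stand-in for the kernel type */

/* Hypothetical per-mapping state: was the map result ever error-checked? */
enum check_state { NOT_CHECKED, CHECKED };

struct tracked_mapping {
	dma_addr_t addr;
	enum check_state state;
	int in_use;
};

#define MAX_TRACKED 16
static struct tracked_mapping tracked[MAX_TRACKED];
static int unchecked_unmaps;	/* mappings unmapped without an error check */

/* Return the live entry for addr, or NULL if it is not tracked. */
static struct tracked_mapping *find_mapping(dma_addr_t addr)
{
	size_t i;

	for (i = 0; i < MAX_TRACKED; i++)
		if (tracked[i].in_use && tracked[i].addr == addr)
			return &tracked[i];
	return NULL;
}

/* Record a new mapping; it starts out as not checked. */
static void track_map(dma_addr_t addr)
{
	size_t i;

	for (i = 0; i < MAX_TRACKED; i++) {
		if (!tracked[i].in_use) {
			tracked[i].addr = addr;
			tracked[i].state = NOT_CHECKED;
			tracked[i].in_use = 1;
			return;
		}
	}
	/* Table full: this sketch simply stops tracking new mappings. */
}

/* Call at the point where the driver tests the mapping for an error. */
static void track_error_check(dma_addr_t addr)
{
	struct tracked_mapping *m = find_mapping(addr);

	if (m)
		m->state = CHECKED;
}

/* Call on unmap; counts mappings that were never checked. */
static void track_unmap(dma_addr_t addr)
{
	struct tracked_mapping *m = find_mapping(addr);

	if (!m)
		return;
	if (m->state == NOT_CHECKED)
		unchecked_unmaps++;
	m->in_use = 0;
}
```

A counter like unchecked_unmaps would give a direct measure of how many mappings go through their whole lifetime without an error check, which speaks to the questions raised above.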
Detailed information follows on the nature of the problems with dma mapping error checking, ranging from not checking mapping errors to not unmapping already mapped pages.