Difference between revisions of "DMA Mapping Error Analysis"

From Linux Driver Project
Revision as of 17:47, 5 September 2012


I analyzed current calls to dma_map_single() and dma_map_page() in the kernel to see whether DMA mapping errors are checked after the mapping routines return.

Reference: linux-next, August 6, 2012.

The goal of this analysis is to find drivers that don't currently check DMA mapping errors and fix them. I grepped for dma_map_single() and dma_map_page() and examined the code that calls these routines. I classified the error-check status of each caller as follows:

Broken:

  1. No error checks
  2. Partial checks - within a source file, not all calls are followed by checks.
  3. Checks DMA mapping errors, but doesn't unmap already mapped pages when a mapping error occurs in the middle of a multiple mapping attempt.

The first two categories are classified as broken and need fixing. The third one also needs fixing, since it leaves dangling mapped pages and holds on to them, which is equivalent to a memory leak. Some drivers release all mapped pages when the device closes, but others don't. Not unmapping might be harmless on some architectures, going by the comments I found in some source files.

Good:

  1. Checks dma mapping errors and unmaps already mapped pages when mapping error occurs in the middle of a multiple mapping attempt.
  2. Checks dma mapping errors without unlikely()
  3. Checks dma mapping errors with unlikely()

I lumped the above three cases together as good cases. Using unlikely() is icing on the cake, and not something we need to be concerned about compared to other problems in this area.
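The difference between the third broken category and the first good one can be sketched in user space. This is a minimal illustration, not driver code: every mock_* name and map_all_or_nothing() are hypothetical stand-ins, and the device and direction arguments of the real calls (dma_map_single(dev, ptr, size, dir), dma_mapping_error(dev, addr), dma_unmap_single(dev, addr, size, dir)) are left out for brevity.

```c
/* User-space sketch; all mock_* names are hypothetical, not the kernel API. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t mock_dma_addr_t;
#define MOCK_DMA_ERROR ((mock_dma_addr_t)~0ULL)

static int mock_live_mappings;     /* mappings currently held */
static int mock_call_count;        /* map calls since the last reset */
static int mock_fail_on_call = -1; /* 0-based call index that fails, -1 = never */

static void mock_reset(int fail_on_call)
{
    mock_live_mappings = 0;
    mock_call_count = 0;
    mock_fail_on_call = fail_on_call;
}

/* Stand-in for dma_map_single(): fails on the configured call. */
static mock_dma_addr_t mock_dma_map_single(void *buf, size_t len)
{
    (void)len;
    if (mock_call_count++ == mock_fail_on_call)
        return MOCK_DMA_ERROR;
    mock_live_mappings++;
    return (mock_dma_addr_t)(uintptr_t)buf;
}

/* Stand-in for dma_mapping_error(). */
static int mock_dma_mapping_error(mock_dma_addr_t addr)
{
    return addr == MOCK_DMA_ERROR;
}

/* Stand-in for dma_unmap_single(). */
static void mock_dma_unmap_single(mock_dma_addr_t addr, size_t len)
{
    (void)addr;
    (void)len;
    assert(mock_live_mappings > 0);
    mock_live_mappings--;
}

/*
 * Map n buffers. If any mapping fails, unmap the ones that already
 * succeeded so nothing is left dangling, then report the failure.
 */
static int map_all_or_nothing(void *bufs[], size_t len, size_t n,
                              mock_dma_addr_t out[])
{
    size_t i;

    for (i = 0; i < n; i++) {
        out[i] = mock_dma_map_single(bufs[i], len);
        if (mock_dma_mapping_error(out[i]))
            goto unwind;
    }
    return 0;

unwind:
    /* i is the index of the failed mapping; release 0 .. i-1. */
    while (i-- > 0)
        mock_dma_unmap_single(out[i], len);
    return -1;
}
```

A caller that returned on the error without the unwind loop would fall into the "doesn't unmap" category: the error is detected, but the earlier mappings stay live.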

dma_map_single() - results

  • No error checks: 195 (46%)
  • Partial checks: 46 (11%)
  • Doesn't unmap: 26 (6%)
  • Good: 147 (35%)

dma_map_page() - results

  • No error checks: 61 (59%)
  • Partial checks: 7 (6.8%)
  • Doesn't unmap: 15 (14.5%)
  • Good: 20 (19%)
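
The shares above can be recomputed from the raw counts. The following is a small sketch of that arithmetic; the helper names and array names are illustrative only, and the counts are copied from the two lists, in the order no error checks, partial checks, doesn't unmap, good.

```c
#include <assert.h>
#include <stddef.h>

/* Counts from the lists above: no checks, partial, doesn't unmap, good. */
static const unsigned map_single_counts[4] = { 195, 46, 26, 147 };
static const unsigned map_page_counts[4]   = { 61, 7, 15, 20 };

/* Total number of call sites in one result set. */
static unsigned sum_counts(const unsigned counts[], size_t n)
{
    unsigned total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += counts[i];
    return total;
}

/* Share of count within total, in percent. */
static double percent_of(unsigned count, unsigned total)
{
    assert(total > 0);
    return 100.0 * count / total;
}
```

For example, percent_of(map_page_counts[1], sum_counts(map_page_counts, 4)) gives the partial-check share for dma_map_page().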

In summary, a large percentage of the cases (more than 50%) go unchecked. That raises the following questions:

  • When do mapping errors get detected?
  • How often do these errors occur?
  • Why don't we see failures related to missing dma mapping error checks?
  • Are they silent failures?

I propose the following to gather more information:

  • Enhance swiotlb or dma-debug infrastructure to track how often the overflow buffer is triggered. I am working on this change.
  • Enhance dma-debug infrastructure to track dma map and unmap errors - I am working on this change.
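
One way to picture the second proposal is a small tracker that records each mapping and notes whether the driver checked it for errors before unmapping. The sketch below is a hypothetical user-space illustration under that assumption; the track_* names, the fixed-size table, and the counters are invented here and do not describe the actual dma-debug code.

```c
/* Hypothetical sketch; none of these names exist in dma-debug. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TRACK_SLOTS 16

struct track_entry {
    uint64_t addr;
    bool in_use;
    bool checked; /* set once the driver tests the mapping for errors */
};

static struct track_entry track_table[TRACK_SLOTS];
static unsigned track_map_errors;       /* mappings that failed */
static unsigned track_unchecked_unmaps; /* unmapped without an error check */

static struct track_entry *track_find(uint64_t addr)
{
    size_t i;

    for (i = 0; i < TRACK_SLOTS; i++)
        if (track_table[i].in_use && track_table[i].addr == addr)
            return &track_table[i];
    return NULL;
}

/* Record the outcome of a map call. */
static void track_map(uint64_t addr, bool failed)
{
    size_t i;

    if (failed) {
        track_map_errors++;
        return;
    }
    for (i = 0; i < TRACK_SLOTS; i++) {
        if (!track_table[i].in_use) {
            track_table[i].addr = addr;
            track_table[i].in_use = true;
            track_table[i].checked = false;
            return;
        }
    }
    assert(0 && "tracking table full");
}

/* Record that the driver checked this mapping for errors. */
static void track_error_check(uint64_t addr)
{
    struct track_entry *e = track_find(addr);

    if (e)
        e->checked = true;
}

/* Record an unmap and flag mappings that were never checked. */
static void track_unmap(uint64_t addr)
{
    struct track_entry *e = track_find(addr);

    if (!e)
        return;
    if (!e->checked)
        track_unchecked_unmaps++;
    e->in_use = false;
}
```

Counters like these would help answer how often mapping errors occur and how many mappings are used without ever being checked.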

Detailed Analysis

The following is the detailed information on the nature of problems with dma mapping error checking, ranging from not checking mapping errors to not unmapping etc.
